





Journal of Speech and Hearing Research


June 1960 VOLUME 3, NUMBER 2


A Factorial Study of Speech Perception 


LAWRENCE N. SOLOMON, JOHN C. WEBSTER, AND JAMES F. CURTIS 


Cognitive Abilities of Deaf Children 


JOSEPH ROSENSTEIN 


Electrodermal Responses of Deaf Children 
WILLIAM W. GRINGS, EDGAR L. LOWELL, AND RONALD R. HONNARD


Auditory Discrimination Learning 
by Aphasic and Nonaphasic Children 


LILLIAN F. WILSON, DONALD G. DOEHRING, AND IRA J. HIRSH 


Visual Spatial Memory in Aphasic Children 
DONALD G. DOEHRING 


Vocal Pitch Variation Related to Changes in Vocal Fold Length 
HARRY HOLLIEN 


Measurements of the Vocal Folds during Changes in Pitch 
HARRY HOLLIEN AND G. PAUL MOORE 


Reliability of Language Measures and Size of Language Sample 
FREDERIC L. DARLEY AND KENNETH L. MOLL 


Articulatory Competency and Reading Readiness 
CARL H. WEAVER, CATHERINE FURBEE, AND RODNEY W. EVERHART 


Extensional Definition and Attitude toward Stuttering 
LONNIE L. EMERICK 


Infant Speech: Effect of Systematic Reading of Stories
ORVIS C. IRWIN 


Several Procedures for Scaling Articulation 
DOROTHY SHERMAN AND WALTER L. CULLINAN 








The American 


Speech and Hearing 


Association 


OFFICERS 
President 


Stanley Ainsworth, Ph.D. 
University of Georgia 


Executive Vice-President 
Jack Matthews, Ph.D. 
University of Pittsburgh 


Vice-President 
Jack L. Bangs, Ph.D. 
Houston Speech and Hearing Center 


Editor of the Association 
Wendell Johnson, Ph.D. 
University of Iowa 


OFFICERS-ELECT 


President-Elect 
G. Paul Moore, Ph.D. 
Northwestern University 


Vice-President-Elect 
Duane C. Spriestersbach, Ph.D. 
University of Iowa


COUNCIL 

The Officers and the following 
Councilors: 

George A. Kopp, Ph.D. (1960) 
Oliver Bloodstein, Ph.D. (1960-62) 
William G. Hardy, Ph.D. (1960-62) 
Ira J. Hirsh, Ph.D. (1958-60) 

Ruth B. Irwin, Ph.D. (1960-63) 
James F. Jerger, Ph.D. (1959-61) 
Hayes A. Newby, Ph.D. (1960-63) 
Wilbert L. Pronovost, Ph.D. (1958-60) 
Dean E. Williams, Ph.D. (1959-61) 


EXECUTIVE SECRETARY 
Kenneth O. Johnson, Ph.D. 





The Journal 
of Speech and Hearing 
Research 


EDITOR 
Dorothy Sherman, Ph.D. 


ASSISTANT TO THE EDITOR 
Dorothy W. Moeller 


STATISTICAL CONSULTANT 
Leonard S. Feldt, Ph.D. 


ASSOCIATE EDITORS 

Oliver Bloodstein, Ph.D. 
Arthur S. House, Ph.D. 
James F. Jerger, Ph.D. 

D. E. Morley, Ph.D. 
Hildred Schuell, Ph.D. 
Arnold M. Small, Ph.D. 
William R. Tiffany, Ph.D. 
John C. Webster, Ph.D. 
Joseph M. Wepman, Ph.D. 


ASSISTANT EDITORS 
Kenneth L. Moll, M.A. 
Martin A. Young, M.A. 


DEPARTMENT EDITORS 
Ernest H. Henrikson, Ph.D. 
Book Reviews 


Martin F. Palmer, Sc.D. 
Records 


BUSINESS MANAGER 
Kenneth O. Johnson, Ph.D. 


APPLICATIONS FOR MEMBERSHIP SHOULD BE ADDRESSED TO THE EXECUTIVE SECRETARY
















A Factorial Study of Speech Perception 


LAWRENCE N. SOLOMON 


JOHN C. WEBSTER 


JAMES F. CURTIS 


Several ‘factor analysis’ studies have 
been carried out to isolate basic audi- 
tory abilities. Karlin (6) found abilities 
or factors such as pitch-quality dis- 
crimination, loudness discrimination, 
memory span, and synthesis analysis. 
Harris (5), working on ‘sonarmen se- 
lection batteries,’ found that loudness 
discrimination itself can be subdivided 
into pure-tone and a more complex 
loudness factor and also found melodic 
memory, sonar performance, complex 
noise discrimination, pitch, and time- 
intensity factors. Hanley (3), in a fac- 
torial investigation of the domain of 
speech perception specifically, identi- 
fied such dimensions as verbal facility, 
tonal detection, voice memory, resist- 
ance to distortion, resistance to mask- 
ing, unpleasantness, synthesis, and a 





Lawrence N. Solomon (Ph.D., University 
of Illinois, 1954) is Assistant Professor of 
Psychology, California Western University. 
John C. Webster (Ph.D., University of Iowa, 
1953), Head, Auditory Detection and Com- 
munications Section, Human Factors Division, 
U.S. Navy Electronics Laboratory, San Diego, 
is now at Cambridge University and with the 
Applied Psychology Research Unit of the
Medical Research Council, Cambridge, England,
on a year’s Senior Postdoctoral National
Science Foundation Fellowship. James F. Curtis
(Ph.D., University of Iowa, 1942) is Professor
and Head, Department of Speech
Pathology and Audiology, University of Iowa.
This article reports research done in the 
Navy Electronics Laboratory. 


Volume 3, No. 2 101 


separate factor for the Seashore Tests 
of Musical Talent. 

Many other studies have found com- 
plexities where noninteracting unitary 
auditory or acoustic traits might have 
been expected, for example, interactions 
among speech sounds themselves (7) 
and have failed to find high correlations 
between seemingly related abilities, for 
example, low correlations between 
pure-tone audiometry and speech re- 
ception tests (1). This abundance of 
auditory factors (3, 5, 6), lack of cor- 
relation between apparently related 
abilities (1), and influence of one sound 
on an adjacent sound (7) indicate that 
speech perception is not yet entirely 
understood. Hanley (3), for example, 
expected to find some positive relation- 
ship between verbal facility and speech 
perception. He hypothesized that rec- 
ognition, detection, and synthesis of 
speech material under difficult condi- 
tions of reception should be facilitated 
for the subject with an extensive vocab- 
ulary since the stimulus material would 
be more familiar and meaningful than 
it would be to the subject with a limited 
vocabulary. This assumption was not 
borne out. He did find a verbal facility 
factor involved in speech perception, 
but found also that this factor was only 
minimally related to other factors in- 
volved in the same behaviour. It was 


June 1960 







Table 1. The 39-test battery and the loadings (no decimals) on the factors listed in the last column.
Roman numerals in parentheses are other factors with test loadings of 40 or greater; Arabic numerals
in parentheses are other factors with test loadings from 20 through 39.

Tests                          Loadings                   Factor
Seashore
    Pitch                      60 (1)                     IV
    Loudness                   17 (1, 2)                  IV
    Time                       35 (1, 2)                  IV
    Rhythm                     44 (8)                     IV
    Tonal Memory               75 (1)                     IV
    Silent Time                34                         IV
Speech
  No Distortion
    Vocabulary                 56 (1, 7, 8)               IV
  Interrupted
    Vocabulary                 52                         IV
    Sentence*                  35 (1, 7, VI)              IV
  Noise Masked
    Sentence                   22 (2, 4, VI)              I
    Spondee†                   01 (II, 7, 8)              I
    Waco†                      60 (2, 4)                  I
    PB†                        50 (2)                     I
    Nonsense†                  45 (3, VII)                I
  Clipped
    Sentence*†                 74                         I
    Spondee†                   69 (7)                     I
    Waco†                      67 (4, 6)                  I
    PB                         61 (5)                     I
    Nonsense†                  54 (IV, VII)               I
  Low Pass
    PB                         45 (VI)                    I
  High Pass
    PB                         34 (2, III, 5, 6)          I
  Reverberation
    Sentence*                  23 (2, 3, 4, 5, 6, 7)      I
    PB                         39 (8, 4)                  I
  Reading
    Sentence                   67 (1)                     II
  Stuttering
    Sentence                   58 (I, III)                II
  Limerick
    Sentence                   73 (5)                     II
  Threshold
    Sentence                   60 (2, 5)                  III
    Spondee                    44 (I, 2, 5)               III
    PB                         18 (1, II, 4, 5, 8)        III
    Nonsense                   30 (I, 5, IV)              III
Audiometer
  Group
    [frequency illegible]      61                         III
    1000†                      80                         III
    4000†                      46 (VIII, 5)               III
  Individual
    500                        68                         V
    1000                       79                         V
    2000                       71                         V
    4000                       64 (VIII)                  V
Memory
  Female Voices
    Prose                      42                         VII
  Male Voices
    Prose                      27                         VII

* Five key-word sentences (not question-answer type).
† New tests in this battery; all other tests had been used by Hanley (3).













concluded by Hanley that the hypothe- 
sized relationship between verbal facil- 
ity and speech perception failed to 
exhibit itself in his study because his 
subject population (university stu- 
dents) was too homogeneous with 
regard to education and verbal facility. 
As Hanley points out, the restricted 
range of talent in this ability most likely 
served to attenuate the intercorrelations 
between verbal facility and other fac- 
tors involved in speech perception and, 
thus, led to a negative conclusion with 
regard to the hypothesized relationship. 
The object of the present study is, 
therefore, (a) to test the general valid- 
ity of Hanley’s results on a more 
heterogeneous, nonuniversity popula- 
tion of subjects, and in particular (b) 
to test whether verbal facility is highly 
correlated with speech perception. 


Method and Procedure 


Subjects. The subjects employed in 
this study were 90 male Navy enlisted 
recruits at the Naval Recruit Training 
Center, San Diego, California; 21 were 
age 17, 36 were 18, 28 were 19 or 20, 
and five were 21 through 24; 49 had 
finished high school, only five had any 
college at all, and eight had finished 
only seventh or eighth grade. 

Apparatus. All tests employed in this 
study (with the exception of the four 
individual audiometer tests) were tape 
recorded and were played to the sub- 
jects over a multiple-headset network 
from a tape recorder. The tests were 
presented monaurally to the subject’s 
better ear over Permoflux PDR-8 ear- 
phones in doughnut cushions. The 
subjects were tested in groups. The 
maximum size of these groups was 36. 


Room. The room in which the tests 




were given was located in a converted 
wooden barracks building at the San 
Diego Naval Training Center. It was 
selected as the quietest among those 
available but was neither sound treated 
nor isolated from outside noises. All 
tests except threshold tests were played 
at comfortably loud listening levels. 
Noise-excluding doughnut cushions, 
NAF-48490, were used on the ear- 
phones, but even so threshold tests may 
have been somewhat masked by room 
noise. 


Test Battery. Of the 39 tests used, 28
had been used by and were furnished by Hanley
(3). The tests (Table 1) were of five 
types: Seashore Tests of Musical Tal- 
ent, individual clinical audiometer tests 
using a Maico E-2 audiometer, the 
Navy Electronics Laboratory Group 
Audiometer Test (9), voice memory 
tests (prose passages were read by 14 
female and 14 male speakers and the 
listener judged whether a particular 
voice was one heard earlier), and a large 
battery of speech perception tests. 

The speech perception tests included: 

(a) two vocabulary tests in which 
40 sets of five synonyms appeared and 
the listener underscored the synonym 
he heard; 

(b) two types of Harvard sentence 
tests: question-answer types as ‘What 
letter comes after D?’ and five key- 
word types (marked by asterisks in 
Table 1), as ‘Deal the cards from the 
top, you bully’; 

(c) spondee words (two equally accented
syllables, as ‘railroad’);

(d) multiple choice word tests, de- 
veloped by Haagen (2) and often re- 
ferred to as Waco tests, in which 18 
three-word items were spoken as ‘top, 
cool, storage,’ and the listener marked 










which of four words (as ‘pop, top, hop, 
prop’) he heard; 

(e) phonetically balanced (PB) 
words; and 

(f) nonsense syllables of CVC type. 

Either these speech perception tests 
were heard in the clear or intelligibility 
was degraded in one of the following 
10 ways: interrupted (half-speech, half- 
silence alternating seven times per sec- 
ond); white noise masked; infinitely 
peak clipped; passed through an 850- 
cps low-pass filter; passed through a 
3400-cps high-pass filter; recorded in a 
reverberant (3- to 4-second time con- 
stant) room; heard in the presence of 
simultaneous reading, stuttering, limer- 
ick reading; or reduced in level until 
threshold was reached. 
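
As an illustration only, two of the degradations listed above (interruption and infinite peak clipping) can be sketched in modern digital terms. The sketch assumes digitized samples, reads "alternating seven times per second" as seven on-off cycles per second with equal on and off times, and treats infinite peak clipping as keeping only the polarity of the waveform. The function names and the sampling rate are hypothetical; the original tests were prepared on tape, not in software.

```python
import math

def interrupt(samples, fs, rate_hz=7.0):
    """Half-speech, half-silence gating at rate_hz cycles per second (assumed reading)."""
    gated = []
    for n, x in enumerate(samples):
        phase = (n * rate_hz / fs) % 1.0      # position within the current on-off cycle
        gated.append(x if phase < 0.5 else 0.0)
    return gated

def infinite_peak_clip(samples):
    """Keep only the sign of each sample, giving a two-level waveform."""
    return [math.copysign(1.0, x) if x != 0 else 0.0 for x in samples]

fs = 7000                                     # hypothetical sampling rate (samples per second)
steady = [1.0] * fs                           # one second of a constant stand-in signal
gated = interrupt(steady, fs)
print(sum(gated) / len(gated))                # fraction of the second that is passed
print(infinite_peak_clip([0.3, -2.0, 0.0]))
```

The filtering, masking, and reverberation conditions would need a signal-processing library and are not sketched here.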


Results and Discussion 


Treatment of the Data. Pearson product-moment
correlation coefficients
were computed for all possible pairs of
tests in the battery, generating a 39 by
39 intercorrelation matrix. Eight orthogonal
factors were extracted from
this matrix by means of the complete
centroid method (8), and were rotated*
to orthogonal simple structure by means
of the Varimax method (4, p. 10).
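
The pipeline described above (intercorrelation matrix, factor extraction, orthogonal rotation) can be sketched as follows. This is a minimal illustration under stated assumptions, not the computation run for this study: the scores are random placeholder data, the extraction step uses principal axes of the correlation matrix as a stand-in for the complete centroid method, and the rotation is a plain (unnormalized) Varimax iteration. All variable names are hypothetical.

```python
import numpy as np

def varimax(loadings, max_iter=200, tol=1e-10):
    """Rotate a (tests x factors) loading matrix toward the Varimax criterion."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        column_ss = np.diag((rotated ** 2).sum(axis=0))
        gradient = loadings.T @ (rotated ** 3 - rotated @ column_ss / p)
        u, s, vt = np.linalg.svd(gradient)
        rotation = u @ vt                      # nearest orthogonal matrix to the gradient
        new_criterion = s.sum()
        if new_criterion <= criterion * (1 + tol):
            break
        criterion = new_criterion
    return loadings @ rotation, rotation

rng = np.random.default_rng(0)
scores = rng.normal(size=(90, 6))              # placeholder: 90 subjects by 6 tests
scores[:, 1] += scores[:, 0]                   # make some "tests" correlate
scores[:, 3] += scores[:, 2]
r = np.corrcoef(scores, rowvar=False)          # Pearson r for every pair of tests

# Stand-in extraction: principal axes of r (NOT the centroid method of the paper).
eigvals, eigvecs = np.linalg.eigh(r)
top = np.argsort(eigvals)[::-1][:2]
unrotated = eigvecs[:, top] * np.sqrt(eigvals[top])

rotated, rotation = varimax(unrotated)
print(np.round(rotated * 100).astype(int))     # loadings without decimals, as in Table 1
```

Because the rotation is orthogonal, each test's communality (the sum of its squared loadings) is the same before and after rotation; only the distribution of variance across factors changes.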

Interpretation of Factors. The results 
of the orthogonal rotation will be dis- 
cussed with the aid of Table 1 which 
has an entry for each of the 39 tests, 
with tests that loaded highly on a given 
factor grouped together. Thus factor 
IV is listed first even though in terms 
of variance reduction it followed fac- 
tors I, II, and III. For each test the 


*The rotation was conducted by Dr. Charles
Wrigley and Dr. Henry F. Kaiser on the
IBM 701 computer at the University of California
Computer Center, with the support of
the National Science Foundation.


loading on the primary factor, listed 
at the left, is tabulated and reference 
is given by Roman numerals to all other 
factors with loadings of 40 or greater 
and by Arabic numerals to all other fac- 
tors with loadings of from 20 through 
39. Thus the noise-masked nonsense- 
syllable test had a loading of 45 on 
Factor I, a loading between 20 and 39 
on Factor III, and a loading of 40 or 
greater on Factor VII. Each factor will 
now be discussed in turn. 
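
The notation just described can be stated as a small rule. The sketch below is illustrative: the function name is hypothetical, the use of absolute values is an assumption (the table does not discuss signs), and the example loadings other than the 45 on Factor I are invented values chosen only to fall in the ranges the text gives for the noise-masked nonsense-syllable test.

```python
ROMAN = ["I", "II", "III", "IV", "V", "VI", "VII", "VIII"]

def table_entry(loadings, primary):
    """Format one Table 1 row.

    loadings: eight loadings multiplied by 100, Factor I first.
    primary:  1-based number of the factor under which the test is grouped.
    """
    others = []
    for number, value in enumerate(loadings, start=1):
        if number == primary:
            continue
        size = abs(value)                      # assumption: magnitude decides the notation
        if size >= 40:
            others.append(ROMAN[number - 1])   # Roman numeral: loading of 40 or greater
        elif size >= 20:
            others.append(str(number))         # Arabic numeral: loading from 20 through 39
    note = f" ({', '.join(others)})" if others else ""
    return f"{loadings[primary - 1]:02d}{note}", ROMAN[primary - 1]

# Hypothetical loadings: only the 45 on Factor I is taken from the table.
print(table_entry([45, 10, 25, 5, 0, 12, 41, 3], primary=1))
```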

Factor I appears to reflect the ability 
to perceive verbal material which has 
been distorted by clipping or noise 
masking. Five of the tests with the 
highest loadings on this factor are tests 
of speech perception under conditions 
of clipping. Three noise-masked tests 
and two threshold tests are also highly 
loaded on this factor. Hanley (3) found 
a similar resistance-to-distortion factor 
which loaded highly on reverberant 
PBs, reverberant sentences, and low- 
and high-passed PBs. 

Factor II appears to reflect resistance
to distraction or to meaningful interference
of some kind. Hanley (3)
found a somewhat similar factor and
called it resistance to masking.

Factor III is highly loaded on group 
tests of hearing acuity and speech 
threshold tests and will be called an 
acuity factor. 

Hanley’s tonal detection factor also 
included audiometer results at all fre- 
quencies and four speech threshold 
tests. His battery included individual 
audiometry but no group tests of hear- 
ing acuity. The battery in the present 
study included both individual and 
group audiometer tests but the loadings 
appear separately in two factors, so be- 
fore discussing Factor III further, Fac- 
tor V will be discussed. 


































Factor V appears to be a measure of 
hearing loss as measured by individual 
pure-tone audiometer tests. It would 
appear that this factor should not have 
existed separately but should have ap- 
peared together with Factor III. Or, 
to agree with Hanley’s results, this fac- 
tor, as well as Factor III, should have 
high loadings for speech threshold tests. 
Low loadings did occur, such as 21, 22, 
23, and 32 (where Factor III loadings 
were 30, 44, 60, and 18). 

In both populations the subjects had 
normal hearing, Hanley’s by selection, 
that of the present study by basic Navy 
physical standards. It is somewhat sur- 
prising therefore to find an acuity or 
hearing loss factor in either study, since 
there is no appreciable spread of pure- 
tone hearing loss for either population. 
If the range of hearing loss scores is 
restricted, the correlation coefficients 
are low and hence no strong intercorre- 
lations or factors would be anticipated. 
Actually there is more similarity among 
scores on different frequencies of the 
group or the individual test than be- 
tween the same frequency on the two 
different tests. This indicates that the 
psychophysical test method used, which 
is radically different between individual 
and group hearing tests, may contribute 
to test variance more than the actual 
pure-tone hearing losses. In other words, 
Factor V instead of being a hearing 
loss factor could equally well be con- 
sidered an individual test or an audiom- 
etry factor, while Factor III could be 
considered a group test acuity factor. 
Since Factor III included some speech 
threshold tests, it is called here an acuity 
factor, and Factor V an individual audi- 
ometry factor. 
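
The restriction-of-range argument used in this paragraph (and in Hanley's explanation of his verbal facility result) can be illustrated with a small simulation. The data are synthetic and the model is an assumption made only for the illustration: two tests each measure a common ability plus independent error, and the "restricted" group keeps only subjects within a narrow band of that ability.

```python
import math
import random

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(1)
ability = [rng.gauss(0, 1) for _ in range(5000)]        # synthetic common ability
test_a = [a + rng.gauss(0, 1) for a in ability]         # test = ability + error
test_b = [a + rng.gauss(0, 1) for a in ability]

full_r = pearson(test_a, test_b)

# Homogeneous sample: keep only subjects close to the mean ability.
keep = [i for i, a in enumerate(ability) if abs(a) < 0.5]
restricted_r = pearson([test_a[i] for i in keep], [test_b[i] for i in keep])

print(round(full_r, 2), round(restricted_r, 2))
```

With the spread of the common ability reduced, the error terms dominate the score variance and the correlation between the two tests falls, even though the tests themselves are unchanged.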

Factor III of this study does, then, 
relate to Hanley’s tonal detection fac- 




tor; both factors are loaded highly with 
threshold tests for tones and speech. 
In this study the group tonal tests are 
more similar to the group speech tests 
than the group tonal tests are to the 
individual tonal (audiometer) tests. 
Hence a new factor, Factor V (indi- 
vidual audiometry) emerged. 

Factor VIII also is to be considered 
in the discussion of pure-tone hearing 
loss. This factor, although it contains 
only two tests, appears to be easily in- 
terpretable as a 4000-cps hearing loss 
factor. It is generally agreed that within 
this age group impairment at 4000 cps 
reflects noise-induced hearing loss. At 
4000 cps even young adult males have a 
spread of hearing loss, and quite logi- 
cally this shows up on both individual 
and group tests. This is the only 
frequency where the original 39 x 39 
correlation matrix showed greater cor- 
relation between group and individual 
tests at a given frequency than between 
different frequencies on the same test, 
either group or individual. 

Factor IV appears to be a verbal-facility-Seashore
factor. Hanley (3)
found a pure Seashore factor and a pure 
verbal facility factor. Again the differ- 
ence in populations should be stressed 
as regards their facility and familiarity 
with tests in general and the motivation 
for taking this particular battery of 
tests. For the recruit population there 
is indeed a verbal facility factor tied 
in with certain abstract-type tests as 
the Seashore battery and the nonsense- 
word tests. Subjects who understood 
the abstractions did well on these tests, 
the others did not. For college majors 
in psychology and speech, for whom 
special and abstract tests are quite com- 
mon, no relationships between verbal 
facility and abstract tests appear to 







exist. Hanley found two distinct fac- 
tors, verbal facility and Seashore. This 
study found one combining the two. 


It was because Hanley (3) found no 
relationship between verbal facility and 
any other tests in his battery that the 
present study was inaugurated. No ex- 
tremely high loadings on Factor I (re- 
sistance to distortion or the ability to 
perceive speech under adverse condi- 
tions) were found for tests loading 
highly on Factor IV. However, of the 
nine tests lumped together in Factor 
IV, six have loadings between 20 and 
40 on Factor I. Similarly of the 13 tests 
(not noise-masked spondees) lumped 
together in Factor I, five have loadings 
between 20 and 40 on Factor IV, one 
test loads 42, and one 19 (clipped sen- 
tences). This implies some relation 
between the verbal-facility-Seashore
factor and speech perception tests 
(Factor I), a stronger relationship at 
least than Hanley (3) found. 
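
The counts in this paragraph can be checked against the parenthetical entries of Table 1. The sketch below transcribes those entries as read from the printed table (two garbled entries are read here as VI and VII, which is an assumption that does not affect the counts); the dictionary names are hypothetical.

```python
# Secondary-factor annotations from Table 1 for the nine tests grouped under Factor IV.
factor_iv_tests = {
    "Seashore Pitch": ["1"],
    "Seashore Loudness": ["1", "2"],
    "Seashore Time": ["1", "2"],
    "Seashore Rhythm": ["8"],
    "Seashore Tonal Memory": ["1"],
    "Seashore Silent Time": [],
    "Vocabulary, no distortion": ["1", "7", "8"],
    "Vocabulary, interrupted": [],
    "Sentence, interrupted": ["1", "7", "VI"],
}

# The 13 tests grouped under Factor I, leaving out noise-masked spondees.
factor_i_tests = {
    "Sentence, noise masked": ["2", "4", "VI"],
    "Waco, noise masked": ["2", "4"],
    "PB, noise masked": ["2"],
    "Nonsense, noise masked": ["3", "VII"],
    "Sentence, clipped": [],
    "Spondee, clipped": ["7"],
    "Waco, clipped": ["4", "6"],
    "PB, clipped": ["5"],
    "Nonsense, clipped": ["IV", "VII"],
    "PB, low pass": ["VI"],
    "PB, high pass": ["2", "III", "5", "6"],
    "Sentence, reverberation": ["2", "3", "4", "5", "6", "7"],
    "PB, reverberation": ["8", "4"],
}

def count_with(tests, mark):
    """Number of tests whose annotation list contains the given factor mark."""
    return sum(mark in marks for marks in tests.values())

iv_on_i = count_with(factor_iv_tests, "1")       # Factor IV tests loading 20-39 on Factor I
i_on_iv = count_with(factor_i_tests, "4")        # Factor I tests loading 20-39 on Factor IV
i_on_iv_high = count_with(factor_i_tests, "IV")  # Factor I tests loading 40 or more on Factor IV
print(iv_on_i, i_on_iv, i_on_iv_high)
```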


Factor VI is difficult to interpret
since so few tests loaded on it. The
presence of two sentence tests, however,
might give an indication that this
is some kind of synthesis factor, where 
the subject is required to synthesize 
the poorly received auditory material 
in order to make some meaningful sen- 
tence out of it and, hence, respond 
correctly to the test item. This does not 
however compare at all with Hanley’s 
synthesis factor. 


Factor VII cannot be succinctly
labeled. It loaded highly on nonsense
syllable and voice memory tests. Hanley
found a voice memory factor that
showed high loadings for male and 
female voices. His battery did not con- 
tain nonsense syllable tests, however, 
so no speculations can be made as to 


whether nonsense materials may have 
had high loadings on his voice memory 
factor. 


Conclusions 


Considering the divergence between 
the two populations, the differences in 
the test batteries, and the type of 
rotated solution (orthogonal vs. 
oblique), the results of this study and 
the results of Hanley’s study are reason- 
ably similar. The factor labeled by 
Hanley as resistance to distortion was 
identified in this study and the agree- 
ment between the two studies was 
good. However, the battery of the 
present study included eight speech 
tests (and a group audiometer test) not 
included in Hanley’s battery and the 
present resistance-to-distortion factor 
included seven of these eight new 
speech tests. The indication is that the
ability to understand filtered, reverberant,
interrupted, clipped, and noise-masked
speech is a single capability.

The agreement between Hanley’s re- 
sistance to masking and the present 
resistance-to-distraction factor is good, 
although in the present study there 
were higher loadings on distraction- 
type tests than on noise-masked tests. 

In contrast to Hanley’s tonal detec- 
tion factor that included both speech 
and tonal threshold tests, this study 
found two related factors, one that con- 
tained group speech and group tonal 
threshold tests, and the other, individual 
pure-tone audiometer tests. Hanley’s 
test battery contained no group tonal 
threshold test. 

The present study found a 4000-cps 
hearing loss factor that was measured 
on both group and individual audiom- 
eter tests. 










Hanley found a verbal facility and 
a Seashore factor; this study found one 
factor that combined the two. In Han- 
ley’s study speech perception tests did 
not load highly on either his verbal 
facility or Seashore factor. Relatively 
high loadings (greater than 0.20) were 
found for speech perception tests on 
the present combined verbal-facility- 
Seashore factor. 


Hanley found voice memory, syn- 
thesis, and unpleasantness factors. This 
study found two rather undefinable 
factors. One factor may have been sim- 
ilar to Hanley’s voice memory factor 
but only vaguely so. The other could 
be called a synthesis factor but was not 
closely related to Hanley’s synthesis 
factor. 


Summary 


Speech and tonal tests were admin- 
istered to 90 male Navy recruits. In- 
cluded were verbal facility tests and 
intelligibility tests of distorted speech, 
noise-masked speech, filtered speech, 
speech at threshold, and speech in the presence
of distracting sounds. Included
also were pure-tone audiometer tests 
and the Seashore Tests of Musical Tal- 
ent. Pearson rs were computed for all 
possible pairs of tests. The resulting 
intercorrelation matrix was factor ana- 
lyzed by the complete centroid method 
and rotated to orthogonal simple struc- 
ture by the Varimax method. Important 


among the eight factors extracted were: 
resistance to distortion, resistance to 
distraction, acuity, individual audiom- 
etry, 4000-cps hearing loss, and verbal- 
facility-Seashore. 

Of the 39 tests in this study, 28 had
been used in a previous factorial study
of college students. Results agree reasonably
well; differences can be explained
by differences between populations
and analysis methods (oblique
vs. orthogonal factor rotations).


References 


1. Carhart, R., Speech reception in relation
to pattern of pure tone loss. J.
Speech Hearing Dis., 11, 1946, 97-108.

2. Haagen, C. H., Intelligibility measurement:
Twenty-four word multiple-choice
tests. Psychological Corp., New York,
Sept., 1945, OSRD Rept. No. 5567. (PB
12050)

3. Hanley, C. N., Factorial analysis of
speech perception. J. Speech Hearing
Dis., 21, 1956, 76-87.

4. Kaiser, H. F., An analytic rotational
criterion for factor analysis. Univ. California
mimeographed abstract, 1956.

5. Harris, J. D., A search toward the
primary auditory abilities. A Decade of
Basic and Applied Science in the Navy
ONR 2, 1957, 244-254. (ASTIA 144524)

6. Karlin, J. E., A factorial study of auditory
function. Psychometrika, 7, 1942,
251-279.

7. Sherman, Dorothy H., The influence of
vowels on recognition of adjacent consonants.
J. Speech Hearing Dis., 17, 1952,
198-212.

8. Thurstone, L. L., Multiple Factor Analysis.
Chicago: Univ. Chicago Press, 1947.

9. Webster, J. C., Development and use of
the NEL recorded warble-tone hearing
test: Part I. U. S. Navy Electronics
Laboratory Report No. 546, 1954.








Cognitive Abilities of Deaf Children 


JOSEPH ROSENSTEIN 


A recent review and analysis (12) of 
cognition in deaf and hearing children 
reveals discrepancies both in the results 
of tests and, even where the results 
agree, in their interpretations. There 
is also disagreement with respect to 
what it is that various tests are pur- 
ported to measure. The inferior per- 
formance of deaf children on con- 
ceptual tasks in some cases has been 
attributed exclusively to the deafness 
(8, 9). Some deaf children, however, 
do not exhibit the ‘conceptual restric- 
tion’ (10) or the reduction in the ability 
to think abstractly. It has been sug- 
gested rather that this inferiority may 
be attributed to the limited language 
experience of the deaf in these studies 
(11). 

Conceptualization progresses in chil- 
dren from perception through abstrac- 
tion to generalization (13). These com- 
ponents have been sampled in deaf 
subjects, but in diverse contexts or 
in isolation. In the present study, be- 
haviors are sampled at several steps 
along the perception-abstraction-gen- 





Joseph Rosenstein (Ph.D., Washington 


University, 1959) is Research Associate, Cen- 


tral Institute for the Deaf, St. Louis. This 
research was partially supported by a grant 
(B-1718) from the National Institute of 
Neurological Diseases and Blindness of the 
National Institutes of Health, and is based in 
part on a doctoral dissertation completed 
under the direction of Dr. Ira J. Hirsh. 




eralization progression to determine 
where the inferiority lies, if, in fact, 
there is inferiority. It was hypothesized 
that when linguistic requirements are 
eliminated or minimized, deaf children 
would not differ from hearing children
in either perceptual or more complex
cognitive behaviors.


Procedure 


Subjects. The subjects were 60 
orally-trained deaf and 60 hearing 
children. The 120 children came from 
six schools: 20 each from a parochial, 
a private, and a public school for hearing
children; and 20 each from a
parochial, a private, and a public school
for deaf children. Ten of the 20 sub- 
jects in each of the six schools were 
eight year olds, and 10 were 12 year 
olds. The 120 subjects thus were di- 
vided into deaf or hearing groups, two 
age groups, and three type-of-school 
groups, permitting a 2 x 2 x 3 factorial 
design. 
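
The layout of the design can be enumerated directly. This is a small sketch that uses only the counts given in the paragraph above; the variable names are hypothetical.

```python
from itertools import product

hearing_status = ("deaf", "hearing")
age_group = (8, 12)                                   # eight- and twelve-year-olds
school_type = ("parochial", "private", "public")
PER_CELL = 10                                         # ten children per school and age level

# One cell for every combination of the three factors.
cells = {cell: PER_CELL for cell in product(hearing_status, age_group, school_type)}

total = sum(cells.values())
deaf_total = sum(n for (status, _, _), n in cells.items() if status == "deaf")
per_school = PER_CELL * len(age_group)

print(len(cells), total, deaf_total, per_school)
```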

Within each age level, the hearing 
and deaf groups were randomly chosen 
from the available school populations. 
Auxiliary descriptive data (mean years 
in school, mean IQ, and mean hearing 
loss) for the separate groups are pre- 
sented in Table 1. The mean age of 
the young group was eight years, six 









Table 1. Mean chronological age in years and months; mean years in school; mean hearing loss, in
decibels, for the deaf children; mean IQ and IQ measuring instrument. Results are arranged by schools
for deaf subjects and for hearing subjects separately.

Group     School      Age     Years in School   Hearing Loss   IQ      IQ Test
Deaf
          Parochial   8-6     4.00              97.0           114.5   Columbia Mental Maturity
          Private     8-6     3.60              87.5           117.3   Advanced Performance
          Public      8-6     3.05              89.9           94.6    Grace Arthur, Form II
          Means       8-6     3.55              91.4           108.8
          Parochial   12-3    7.95              92.7           122.5   Columbia Mental Maturity
          Private     12-3    8.00              89.5           114.6   Advanced Performance
          Public      12-5    6.85              83.0           98.8    Grace Arthur, Form II
          Means       12-4    7.60              88.4           111.9
Hearing
          Parochial   8-6     3.25                             113.4   Kuhlmann-Anderson
          Private     8-6     3.60                             114.9   Stanford-Binet
          Public      8-6     3.35                             105.8   Kuhlmann-Anderson
          Means       8-6     3.40                             111.4
          Parochial   12-4    7.20                             113.3   Kuhlmann-Anderson
          Private     12-3    7.20                             120.8   Stanford-Binet
          Public      12-3    7.25                             104.2   Otis Beta
          Means       12-3    7.22                             112.8








months; of the old group, 12 years, 
3.5 months. All of the deaf subjects 
had sustained deafness prior to age two, 
had no known disorders other than 
deafness, and had a mean hearing loss 
of 89.9 db. Scholastic training was al- 
most equal for the groups: average 
years in school for the young deaf chil- 
dren was 3.55 years; for the young 
hearing children 3.40 years; for the old 
deaf groups 7.60 years; and for the old 
hearing groups 7.22 years. The mean IQ 
of the hearing group (112.07) closely 
approximated that of the deaf group 
(110.38), but this comparison is weak 
because the measures for IQ assessment 
differed from school to school. (The 
correlation of IQ with test performance 
is treated below.) 
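
The group means quoted in this paragraph follow from the school means in Table 1, since every school and age cell contains the same number of children. The check below is a sketch; it reads the IQ entry for the older parochial hearing group as 113.3, which is an assumption about a garbled printed value but is the value that reproduces the reported means.

```python
def mean(values):
    return sum(values) / len(values)

# School means from Table 1: younger groups first, then older groups.
deaf_iq = [114.5, 117.3, 94.6, 122.5, 114.6, 98.8]
hearing_iq = [113.4, 114.9, 105.8, 113.3, 120.8, 104.2]   # 113.3 is an assumed reading
deaf_hearing_loss_db = [97.0, 87.5, 89.9, 92.7, 89.5, 83.0]
deaf_young_years = [4.00, 3.60, 3.05]
hearing_old_years = [7.20, 7.20, 7.25]

# Equal cell sizes, so the mean of the school means equals the overall group mean.
print(round(mean(deaf_iq), 2))
print(round(mean(hearing_iq), 2))
print(round(mean(deaf_hearing_loss_db), 1))
print(round(mean(deaf_young_years), 2), round(mean(hearing_old_years), 2))
```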


Apparatus and Materials. Three non- 
verbal visually presented tasks were 
administered to individual subjects 
through a modified Rational Learning 


Apparatus, described by Bunch and 
Hagman (2), which provides for ex- 
perimental control of reward for cor- 
rect responses in multiple-choice situ- 
ations. 









































Figure 1. Viewing apparatus: A, front or 
subject’s view; B, rear or experimenter’s 
view. 










Figure 2. Schematic diagram of the circuit
of the viewing apparatus. Arrows correspond
to those in Figure 1. (Diagram labels: response jacks,
reward, control switches, response indicator lights.)


A reproduction of the front, or sub- 
ject’s view, of the cabinet is shown in 
Figure 1A. A response is effected 
whenever the response plug is inserted 
into one of the eight response jacks. 
The green reward light is illuminated 
whenever that response is correct. Fig- 
ure 1B shows the apparatus from the 
rear, or experimenter’s view. The ex- 
perimenter controls the reward for 
any one of the eight possible alterna- 
tives by control switches. Correspond- 
ing indicator lights show the experi- 
menter which response has been made. 


Figure 2 shows the schematic diagram
of the electrical circuitry. All
of the stimuli were either drawn or 
mounted on slides and were projected 
from the rear of the cabinet onto the 
screen. 

Following are the three tasks re- 
quired of the subjects: perceptual dis- 
crimination, multiple classification, and 
concept attainment and usage. 

Perceptual Discrimination. The per- 
ceptual discrimination task involved the 
ability to perceive differences among 
objects with bidimensional characteris- 
tics. It was the subject’s task to choose, 
out of eight similar objects, that one 
object which differed from the remain- 
ing seven with respect either to color 


or form. The colors of red, blue, green, 
and yellow, and the forms of square, 
circle, and triangle were used, presented 
in random combinations of two colors 
or two forms. Five slides of discrimina- 
tions based on color differences were 
alternated with five slides of discrimi- 
nations based on form differences. 
Figure 3 shows the sample slide and 
two of the 10 test slides. No limit 
was imposed on the number of re- 





























Figure 3. Sample and representative slides
for the perceptual discrimination task.













Figure 4. Representative slides and response-guide
card for the multiple classification task
(Wisconsin Card Sorting Task).


sponses to a given slide. The reward 
was given when the correct response 
was made. 
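
The structure of the perceptual discrimination slides can be sketched as follows. The colors, forms, number of objects, and alternation of color and form slides come from the description above; the random seed, the choice to start with a color slide, and the rule that the non-critical dimension is held constant within a slide are assumptions made for the illustration.

```python
import random

COLORS = ("red", "blue", "green", "yellow")
FORMS = ("square", "circle", "triangle")

def make_slide(dimension, rng):
    """Return (objects, odd_position): eight objects, one differing on the given dimension."""
    if dimension == "color":
        common_color, odd_color = rng.sample(COLORS, 2)
        form = rng.choice(FORMS)               # assumption: form is constant on a color slide
        common, odd = (common_color, form), (odd_color, form)
    else:
        common_form, odd_form = rng.sample(FORMS, 2)
        color = rng.choice(COLORS)             # assumption: color is constant on a form slide
        common, odd = (color, common_form), (color, odd_form)
    position = rng.randrange(8)                # which of the eight response jacks is correct
    objects = [common] * 8
    objects[position] = odd
    return objects, position

rng = random.Random(3)                         # hypothetical seed, for repeatability only
slides = [make_slide("color" if i % 2 == 0 else "form", rng) for i in range(10)]

for objects, position in slides[:2]:
    print(position, objects[position], objects[(position + 1) % 8])
```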


Multiple Classification. The multiple
classification task represents application
of abstracted perceptual properties
as guides to behavior. A concept may
be regarded as an abstract idea when
an identical perceptual element is embodied
in successive instances (14).
The property of ‘blueness,’ for example,
would be the concept for the
series of two blue boxes, five blue balloons,
and one blue chair.


Test materials, instructions, and scor- 
ing for the Wisconsin Card Sorting 
Task (WCST), used in the multiple 
classification task, have been described 
by Berg (1) and Grant and Berg (5).
In brief, 60 slides of a possible 64 were 
used. Each contained from one to four 
identical forms of a single color. Four 
colors and four forms were used: red, 


green, yellow, and blue; square, circle, 
triangle, and star. A single slide, for 
example, might contain two red circles, 
or four yellow triangles (see Figure 
4). A response-guide card displaying 
the four combinations not included 
among the 60 slides was inserted in a 
rack below the bottom four jacks on 
the apparatus (Figure 4). The only 
restriction on the randomized order 
of presentation of the 60 slides was 
that no single color, form, or number 
should follow itself immediately. The 
same order was presented to all sub- 
jects, and only one response per slide 
was permitted. 
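The construction of the slide set described above can be sketched in Python. This is a minimal illustration, not the author's procedure: the four combinations on the response-guide card are not listed in the article, so `GUIDE_CARD` below is hypothetical, and the ordering rule is read here as "consecutive slides share no color, form, or number."

```python
import itertools
import random

COLORS = ["red", "green", "yellow", "blue"]
FORMS = ["square", "circle", "triangle", "star"]
NUMBERS = [1, 2, 3, 4]

# Hypothetical guide card: the article does not say which four
# combinations were reserved for the response-guide card.
GUIDE_CARD = [("red", "triangle", 1), ("green", "star", 2),
              ("yellow", "square", 3), ("blue", "circle", 4)]

def build_slides():
    """Return the 60 slides left after removing the guide-card combinations."""
    all_cards = list(itertools.product(COLORS, FORMS, NUMBERS))  # 64 in total
    return [card for card in all_cards if card not in GUIDE_CARD]

def shares_attribute(a, b):
    """True if two slides have the same color, form, or number."""
    return any(x == y for x, y in zip(a, b))

def ordered_deck(slides, rng, max_restarts=1000):
    """Randomly order slides so that neighbors share no attribute.

    Uses a greedy random walk and restarts whenever it reaches a dead end.
    """
    for _ in range(max_restarts):
        remaining = slides[:]
        deck = []
        while remaining:
            candidates = [c for c in remaining
                          if not deck or not shares_attribute(deck[-1], c)]
            if not candidates:
                break  # dead end: start over with a fresh walk
            pick = rng.choice(candidates)
            deck.append(pick)
            remaining.remove(pick)
        if not remaining:
            return deck
    raise RuntimeError("no valid ordering found")

deck = ordered_deck(build_slides(), random.Random(0))
```

Because the same order was presented to all subjects, a fixed seed (as in the last line) plays the role of the single prepared sequence.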

The sequence color-form-number 
was arbitrarily chosen as the order of 
concepts to be rewarded. (A ‘concept’ 
is here defined as a dimension of display 
correctly identified 10 times in suc- 
cession.) When responses were given 
to the concept of color 10 times in 
succession, the concept was shifted, 
without the subject’s knowledge, to 
form, and responses to color were no 
longer reinforced by the reward light. 
In like manner, when form responses 
were observed 10 times in succession, 
the concept was again shifted; this time, 
to number. It was possible to record, 
in this manner, perseverative errors 
which indicated how many slides it 
took before a subject would relinquish 
a no-longer-reinforced response in fa- 
vor of the new concept by which he 
could guide his behavior. 
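The shift rule and the scoring of perseverative errors can be sketched as follows. This is a simplified illustration under stated assumptions: each response is labeled with the single dimension the subject sorted by (in the actual task a response may match a slide on more than one dimension), a perseverative error is counted as an error that follows the most recently abandoned concept, and the color-form-number sequence is assumed to repeat after number, which the text does not state.

```python
CONCEPT_ORDER = ["color", "form", "number"]
RUN_LENGTH = 10  # consecutive correct responses required before a shift

def score_session(responses):
    """Score a list of dimension names, one per slide, in presentation order."""
    stage = 0          # index of the currently rewarded concept
    run = 0            # current streak of correct responses
    previous = None    # concept rewarded before the most recent shift
    correct = errors = perseverative = attained = 0
    for dimension in responses:
        target = CONCEPT_ORDER[stage % len(CONCEPT_ORDER)]  # assumed to cycle
        if dimension == target:
            correct += 1
            run += 1
            if run == RUN_LENGTH:
                attained += 1
                previous = target
                stage += 1   # shift without informing the subject
                run = 0
        else:
            errors += 1
            run = 0
            if dimension == previous:
                perseverative += 1
    return {"correct": correct, "errors": errors,
            "perseverative": perseverative, "attained": attained}

# A hypothetical subject: attains color, persists on color for three
# slides after the shift, attains form, then begins sorting by number.
example = ["color"] * 13 + ["form"] * 10 + ["number"] * 5
result = score_session(example)
```

For this hypothetical record the function reports 25 correct responses, 3 errors (all perseverative), and 2 concepts attained.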

Concept Attainment and Usage. The 
third task, concept attainment and 
usage, concerned the attainment of con- 
cepts through a rational learning situ- 
ation. In this case, concepts are defined 
according to nonperceptual attributes 
of instances, and are labeled class concepts
(14). An example of a class concept
would be the response ‘means of
transportation’ to the series of a sled, 
a rowboat, and a horse. 


Three series (lists), of five pictures 
each, served as stimuli for this task. 
A fourth list contained the class con- 
cept labels (in printed word form), 
members of which were displayed in 
the pictures in each of the three lists. 
This latter criterion list served as a 
measure of concept usage. The four 
lists were as follows: 


List 1      List 2    List 3    Criterion
rabbit      dog       kitten    animals
airplane    bird      kite      can fly
cake        bread     corn      foods
car         bus*      train     can ride
jacket      shoe      hat       can put on


*As it happened, bus was initially presented before 
bird for all subjects; hence no confusions arose 
between the concepts of Transportation and Air- 
borne from list one to list two. 


For concept attainment, the first list 
was presented in randomized succes- 
sions, until subjects had learned, by 
trial and error, to associate to a cri- 
terion each picture in the list with 
one of five arbitrarily assigned num- 
bers. (Since the apparatus carried eight 
numbers, three numbers, of course,
were not used.) The second list was 
associated in the same manner, with 
the assigned numbers in this list corre- 
sponding to those assigned to pictures 
(members) of the same class in the 
first list. Similarly, the third list was 
learned. In this manner, the number 
which matched with members of successive
lists became the symbolic label
for the class concept which was embodied.
Decrements in error scores
from list to list indicated the progress 
(attainment) of the concept as its 


members were associated with the ap- 
propriate number. 


The criterion list represents the 
verbal counterpart of the numerical 
(symbolic) labels for the class concepts 
in the first three lists. The ability to 
generalize the number to the correct 
label in the criterion list was taken 
as the index of concept usage. It will 
be noted that labels for the concepts 
in the criterion list were framed in such 
a way that they would be included 
within the vocabulary experience of 
both deaf and hearing children. 
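The relation between the three picture lists, the numerical labels, and the criterion list can be expressed as a small data structure. The sketch below takes the list contents from the table above; the particular numbers in `CLASS_NUMBERS` are hypothetical, since the article says only that five of the eight available numbers were arbitrarily assigned.

```python
LISTS = {
    "List 1": ["rabbit", "airplane", "cake", "car", "jacket"],
    "List 2": ["dog", "bird", "bread", "bus", "shoe"],
    "List 3": ["kitten", "kite", "corn", "train", "hat"],
}
CRITERION = ["animals", "can fly", "foods", "can ride", "can put on"]

# Hypothetical assignment: one number (1 to 8) per class, in row order.
CLASS_NUMBERS = [3, 6, 1, 8, 4]

def number_for(item):
    """Return the number that labels the class to which an item belongs."""
    for members in LISTS.values():
        if item in members:
            return CLASS_NUMBERS[members.index(item)]
    if item in CRITERION:
        return CLASS_NUMBERS[CRITERION.index(item)]
    raise KeyError(item)

# Members of one class share a number across lists, and the criterion
# label for that class is expected to receive the same number.
animal_numbers = {number_for(x) for x in ("rabbit", "dog", "kitten", "animals")}
```

Concept usage, in these terms, is a correct call of the kind `number_for("can ride") == number_for("train")` made by the subject on the single criterion presentation.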


Method. The battery was presented 
to subjects individually in a semi- 
darkened room in a quiet wing in 
each of the six schools. No response- 
guide cards were in the racks at the 
beginning of the testing period. Sub- 
jects were given the response plug, 
and the sample slide of the discrimina- 
tion task was flashed on the screen. 
Hearing subjects were given the following
instructions: ‘Look at the
screen. Put the plug into the hole that
will make this light go on. (Pause)
Which one will make the light go on? 
When you light the light you are 
correct. Light the light.’ For the deaf 
subjects, these same instructions were 
pantomimed, with the corresponding 
items (plug, jacks, reward light) indi- 
cated by gesture. For some of the older 
deaf subjects, who were more accus- 
tomed to actual verbal interchange, the 
simplified instructions: ‘Light the light. 
Which one do you think it is?’ were 
given by the experimenter. The pan- 
tomimed and gestured instructions for 
the deaf were considered equal to the 
verbal instructions given to hearing 
subjects. Neither group was placed 

Table 2. Means and standard deviations for total correct and total error responses for the 12 groups,
8- and 12-year-old deaf and hearing children, by schools, on the multiple classification task (Wisconsin
Card Sorting Task).

                          CA: 8             CA: 12
Group     School       Mean     SD       Mean     SD

Correct
Deaf      Parochial    28.4    9.21      38.7    5.04
          Private      34.6    9.19      34.8    7.63
          Public       34.9    8.70      37.8    6.13
Hearing   Parochial    34.9    8.86      34.7    5.88
          Private      32.2    7.39      35.7    5.42
          Public       34.0    5.18      30.7    6.68

Errors
Deaf      Parochial    32.7    8.72      22.3    5.24
          Private      27.0    6.25      24.9    6.98
          Public       25.0    4.17      23.5    5.99
Hearing   Parochial    26.9    8.07      25.6    5.50
          Private      29.8    7.49      25.2    5.06
          Public       27.4    5.35      30.2    6.14








either at an advantage or at a disad- 
vantage with respect to task instruc- 
tions. 

Upon completion of the discrimina- 
tion task, the response-guide card for 
the WCST was inserted and the in- 
structions ‘Light the light’ were reiter- 
ated. If the subject responded incor- 
rectly to the first few slides of the 
WCST, the experimenter indicated that 
the light had not become illuminated, 
directed the subject’s attention to the 
four stimuli on the response-guide 
card, and urged him to choose the 
most appropriate category for the on- 
coming stimulus slide. The remainder 
of the 60 WCST slides were projected 
with no further comment. 

For the concept attainment and 
usage task, the numbered response- 
guide cards were inserted in place, and 
a sample slide in which the number 
‘7’ was displayed, was flashed on the 
screen. Subjects had no difficulty in ap- 
prehending that their task was simply 
to match, and little difficulty was noted 


when subjects were then required to 
find and match the projected object 
with its number. Subjects were ad- 
vised to continue responding until the 
light was lit, and to remember it. 

The battery took from 30 to 50 min- 
utes to complete. Motivation appeared 
to be quite high for all subjects. If, 
for any reason, reluctance or hesitancy 
of response was displayed, the only 
additional comments made by the experimenter
for encouragement were
‘Go ahead,’ ‘Think now,’ and ‘Light
the light.’ All responses to all experimental
stimuli were recorded.

[Figure 5. Mean number correct responses
and mean number errors on the multiple
classification task (Wisconsin Card Sorting
Task) for the 12 experimental groups, 8-
and 12-year-old deaf and hearing children,
by schools.]


Results 


Perceptual Discrimination. Every 
subject attained a perfect score on the 
discrimination task, that is, responded 
correctly to each of the 10 slides on 
the first attempt. Although this task 
was not sufficiently sensitive to detect 
overall differences in perceptual dis- 
crimination, it did demonstrate that 
none of the deaf (or hearing) subjects 
had any difficulty in discriminating the 
color- or form-objects which served as 
stimuli for the second task. (The test 
also served as a color-blindness check; 
one subject was eliminated in this 
manner and was replaced.) 


Multiple Classification (WCST). The 
four scores that were obtained from 
the data on the WCST are total correct 
responses, total error responses, per- 
severative error responses, and number 
of concepts attained. 


Total Correct Responses and Total 
Errors. The mean scores and standard 
deviations for each of the 12 groups 


[Figure 6. Mean number of perseverative errors
and of concepts attained on the Wisconsin
Card Sorting Task for the 12 groups, 8-
and 12-year-old deaf and hearing children,
by schools.]


Table 3. Means and standard deviations of perseverative error (raw and square-root transformed)
scores for all groups, 8- and 12-year-old hearing and deaf children, by schools, on the Wisconsin Card
Sorting Task.

                            CA: 8               CA: 12
Group     School        Mean       SD       Mean      SD

Raw Scores
Deaf      Parochial     7.6       6.23      7.6      2.20
          Private    [illegible]  3.20      5.2      2.79
          Public        6.9       3.42      5.8      3.92
Hearing   Parochial     7.1       5.45      4.1      1.81
          Private       6.6       4.65      6.7      2.88
          Public        9.4       7.02      4.6      2.65

Transformed Scores
Deaf      Parochial     2.59      1.18      2.82     0.41
          Private       2.79      0.55      2.33     0.55
          Public        2.64      0.64      2.38     0.80
Hearing   Parochial     2.56      1.04      2.08     0.53
          Private       2.50      0.92      2.62     0.56
          Public        2.98      1.00      2.19     0.57

















Table 4. Means and standard deviations of number of concepts attained on the Wisconsin Card
Sorting Task for the 12 groups, 8- and 12-year-old hearing and deaf children, by schools.

                            CA: 8               CA: 12
Group     School        Mean       SD       Mean      SD

Deaf      Parochial  [illegible]  1.10      3.1      0.54
          Private       2.4       0.80      2.3      1.01
          Public        2.6       0.80      2.4      1.02
Hearing   Parochial     2.2       1.08      2.0      1.10
          Private       1.9       0.94      2.4      0.66
          Public        2.2       0.75      2.0      0.63








for total correct and total error re- 
sponses are presented in Table 2, and 
are illustrated in bar-graph form in 
Figure 5. Inspection of these data 
reveals no consistent differences among 
the groups. Analyses of variance for 
both these scores also indicated no sig- 
nificant differences among the groups 
that might be attributed to age, hearing 
status, or school environment. 
Perseverative Errors. The mean 
scores and standard deviations for per- 
severative errors are presented in Table 
3, and are illustrated in the upper 
portion of Figure 6. It would appear 
that the younger groups, more than 
the older groups, tend to continue in 
their response to stimuli after they are 


Table 5. Summary of analysis of variance for
evaluation of data on mean number of concepts
attained on the Wisconsin Card Sorting Task.

Source                  df      ms       F

Age (A)                  1     1.20     1.87
Hearing Status (H)       1     2.70     3.08
Type of School (S)       2     [illegible: 0 04]
A x H                    1     [illegible: 0 8&4]
A x S                    2     1.60     1.82
S x H                    2     [illegible: %.10]
A x S x H                2     3.23     3.68*
Within Groups          108     0.88
Total                  119

*Significant at the .05 level.


no longer rewarded for those stimuli. 
Analysis of variance on the raw scores 
was not feasible, because the variances 
were not homogeneous according to 
Bartlett’s test (3). However, the raw 
scores were transformed to another 
scale, a square-root transformation 
(3), which permitted the analysis of 
variance. The means and standard de- 
viations of the transformed scores also 
are presented in Table 3. This analysis 
of variance indicated no statistically 
significant differences among the
means of the transformed error scores. 
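The effect of the square-root transformation on the spread of the group variances can be gauged directly from the standard deviations printed in Table 3. The sketch below computes a crude largest-to-smallest variance ratio before and after transformation. It is only an illustration: this ratio is not Bartlett's test, which is the test the analysis relied on, and the function name is introduced here for convenience.

```python
# Standard deviations as printed in Table 3 (12 groups each).
RAW_SD = [6.23, 3.20, 3.42, 5.45, 4.65, 7.02,
          2.20, 2.79, 3.92, 1.81, 2.88, 2.65]
TRANSFORMED_SD = [1.18, 0.55, 0.64, 1.04, 0.92, 1.00,
                  0.41, 0.55, 0.80, 0.53, 0.56, 0.57]

def variance_ratio(sds):
    """Ratio of the largest to the smallest group variance."""
    variances = [sd ** 2 for sd in sds]
    return max(variances) / min(variances)

raw_ratio = variance_ratio(RAW_SD)                   # about 15.0
transformed_ratio = variance_ratio(TRANSFORMED_SD)   # about 8.3
```

On this rough index the transformation reduces the disparity among the group variances by nearly half, which is the direction of change the transformation is meant to produce.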


Number of Concepts Attained. The 
mean scores and standard deviations 
for the 12 groups for number of con- 
cepts attained in the WCST are pre- 
sented in Table 4, and illustrated in 
the lower portion of Figure 6. This 
figure reveals no great differences 
among the mean number of concepts 
attained, except perhaps that between 
the young and old parochial deaf chil- 
dren. There is no ready explanation 
for the relatively large difference be- 
tween deaf eight-year-old and deaf 
12-year-old parochial children and the 
negligible differences between the 
other eight- and 12-year-old groups. 
A large experimental error might be 
suspected but none such occurred. It 





Table 6. Means and standard deviations of errors per list (Lists 1, 2, 3, and Criterion) of the concept
attainment task for the 12 groups, 8- and 12-year-old deaf and hearing children, by schools.

                           List 1              List 2            List 3           Criterion
Group     School       Mean      SD        Mean     SD       Mean     SD       Mean        SD

CA: 8
Deaf      Parochial    66.7     28.4       48.0    30.2      21.1     8.4      14.1        5.7
          Private      59.8     36.5       39.2    14.1      16.6     8.4      11.5        8.4
          Public       53.8  [illegible]   37.0    17.1      15.8     7.2      11.3        6.0
Hearing   Parochial    80.2     59.6       52.3    38.5      30.3    23.0      15.0        5.6
          Private      53.5     31.8       56.0    42.8      20.5     9.0      12.3        6.6
          Public       48.4     30.2       38.0     9.6      38.1    18.4      14.0        5.8

CA: 12
Deaf      Parochial    43.3     19.9       29.0    12.8      21.1     9.0      11.9    [illegible]
          Private      87.3     20.9       30.1     9.4      26.5    21.1   [illegible: 1127]  8.4
          Public       42.4     27.0       34.4    10.8      13.1    11.1      13.8        9.5
Hearing   Parochial    40.6     33.2       39.2    26.1      26.5    14.4      14.6        5.8
          Private      30.3     13.0       33.3    24.8      17.2     6.4      10.1        6.2
          Public       51.0     31.7       30.1    12.5      22.2    14.3      10.2        7.8








seems clear that except for this one
pair of groups no significant differences
occur in the data. The analysis
of variance, summarized in Table 5, indicates
only one significant result, that
for the A x S x H interaction.





[Figure 7. Mean number of errors for 8-year-old and 12-year-old groups (deaf and hearing
children, by schools) for the criterion trial and as a function of successive lists on the concept
attainment and usage task.]



Concept Attainment and Usage. 
Mean number of errors per list and 
standard deviations for lists one, two, 
and three for the young and old groups 
are presented in Table 6 and graphically 
in Figure 7. It appears that the learning 
of lists one and two is more difficult 
for the younger children than for the 
older groups. Error scores for list 
three, however, indicate that both 
young and old groups achieve the 
same level of performance; or, older 
children make fewer initial errors in 
their learning than do younger groups. 

Because these data were not homoge- 
neous with respect to variance, logarith- 
mic transformation of the error scores 
was performed prior to the analyses 
of variance (3). The analyses of these 
transformed scores again indicated no 
significant differences between deaf 
and hearing groups, between young 
and old children, or among types of 
school. 
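One reason a logarithmic transformation is commonly chosen is that the standard deviations rise with the means. The sketch below checks this for a subset of Table 6, the nine List 1 to List 3 cells of the 8-year-old hearing groups, all of which are legible. It is an illustration on that subset only, not a reanalysis of the full data.

```python
from math import sqrt

# (mean, SD) pairs from Table 6: hearing children, CA 8, Lists 1 to 3,
# for the parochial, private, and public schools in turn.
CELLS = [(80.2, 59.6), (52.3, 38.5), (30.3, 23.0),
         (53.5, 31.8), (56.0, 42.8), (20.5, 9.0),
         (48.4, 30.2), (38.0, 9.6), (38.1, 18.4)]

def pearson_r(pairs):
    """Product-moment correlation between the two members of each pair."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / sqrt(sxx * syy)

r_mean_sd = pearson_r(CELLS)
```

A strongly positive `r_mean_sd` indicates that the groups with more errors are also the more variable ones, which is the kind of heterogeneity the transformation was applied to reduce.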


Another way of viewing performance
in this task is to examine the
number of trials to criterion needed for
each list. The graphical representation
of mean number of trials per list for
the 12 groups resembles the same
trends shown in Figure 7. Analyses
of variance for trials-to-criterion scores
were not performed, because the correlations
between errors per list and
trials to criterion per list were extremely
high.¹

¹Some Pearson rs between these two
variables (errors per list and trials to criterion
per list), for example, were .98, .90, .86, .92,
.91.


The mean errors for the single presentation
of the criterion list attained
by each group (Table 6) are also illustrated
in Figure 7. It is apparent that
no differences exist among these groups,
as was similarly demonstrated for criterion
list errors by the analysis of
variance which indicated no significant
differences among the groups.


Correlation with IQ Scores. The 
contribution of intelligence to per- 
formance, in those investigations deal- 
ing with almost any psychological 
ability in which comparisons are made 
between any two groups, is very often 
a point of contention or question. In 
the present study, there was no attempt 
to match the groups with respect to 
IQ scores primarily because of the 
obvious difficulty in finding a measure 
that would be equally applicable to 
deaf and hearing groups. Correlations 
between IQ scores for individual 
groups of subjects and certain of the 
battery scores, therefore, were com- 
puted. The measures correlated with 
group IQ scores were the correct re- 
sponses, perseverative responses, and 
the number of attained concepts for 
the WCST; and the errors for list 
one, list three, and the criterion list 
for the concept attainment task. The 
results of these computations are pre- 
sented below in order to demonstrate 
that the presumed relation between in- 
telligence and the conceptual activity 
here in question was not upheld. 

Only five of the resultant 72 corre- 
lations were found to be significant 
beyond the .05 level and none were 
significant beyond the .01 level. These 
five correlations are not all attached 
to any particular score, age group, or 
school group. If levels of significance 
beyond the .01 level had been reached, 
a more exacting analysis and explana- 
tion of these data would have been 










indicated. It does not appear that the 
IQ, in any given group, is correlated 
with performance on this test battery. 
This finding is consistent with that of 
Fey (4), who says that intelligence 
is not a major factor in determining 
the adequacy of performance on the 
WCST. In addition, Magaret, Grant,
and Berg² have noted that feebleminded
children have been found to
be capable of successful WCST per- 
formance. No deaf children were in- 
cluded in the studies just cited, how- 
ever. 
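The count of five significant correlations out of 72 can be set against what chance alone would yield. The sketch below computes the expected number of correlations reaching the .05 level and the probability of observing five or more, under the simplifying assumption that the 72 tests are independent. That assumption is not strictly met here, since the same children contribute to several measures, so the figure is only a rough gauge.

```python
from math import comb

N_CORRELATIONS = 72   # 12 groups x 6 battery measures
ALPHA = 0.05
OBSERVED = 5

# Number of correlations expected to reach the .05 level by chance alone.
expected_by_chance = N_CORRELATIONS * ALPHA

def prob_at_least(k, n, p):
    """Binomial probability of k or more successes in n independent trials."""
    return 1.0 - sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k))

p_five_or_more = prob_at_least(OBSERVED, N_CORRELATIONS, ALPHA)
```

About 3.6 such correlations would be expected by chance, so five is not far from that expectation, in keeping with the conclusion that IQ was not related to performance on this battery.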


Discussion 


Age. Differences in performance be- 
tween the eight-year and 12-year 
groups did not emerge in the statistical 
analyses, although a tendency for the 
expected better performance of older 
children was observed. This might sug- 
gest that test items may not have been 
sufficiently sensitive to detect differ- 
ence by age, but another interpreta- 
tion is available. Many of the older 
deaf and hearing subjects, upon test 
completion, questioned the experimenter
as to the nature of the task,
and the required solution. They indi- 
cated that they had been guided by 
hypotheses other than that required for 
the correct response. Weir and Stevenson
(15, p. 148) also have noted this
effect in a study of verbalization and
learning: ‘Older subjects developed
more complex hypotheses concerning
the solution of the problem and these
hypotheses hindered the development
of the more simple, correct solution.’

²Magaret, A., Grant, D. A., and Berg, E.
A., A study of the performance of endogenous
feebleminded children on the Wisconsin
Card Sorting Task. Unpublished material,
quoted in Fey (4).


Schools for the Deaf. The results 
further indicate no significant differ- 
ences among deaf children from pri- 
vate, public, or parochial oral schools. 
The finding of no difference among 
schools, however, might have been al- 
tered had the deaf groups been chosen 
from oral, manual, combined, and sign- 
language classes for the deaf. Support 
for this statement is found in Hayes’ 
(7) interpretation of his replication of 
Myklebust and Brutten’s (10) study. 
Hayes suggests that the discrepancy 
between the findings of the two studies 
is attributable to the different educa- 
tional treatment that the deaf groups 
received. Hayes’ sample of deaf chil- 
dren, in which no inferiority of visual 
perception was noted, was a sample 
from an oral school for the deaf. 


Educational Implications. The findings
of this study also suggest that
deaf children are capable of cognitive
behavior. Where they have encountered
difficulty in this area may well
be where the linguistic demands have 
been beyond their experience. One way 
of lessening this difficulty is, of course, 
to expose the deaf child to even more 
linguistic experience and to an even 
greater vocabulary. It appears, how- 
ever, that memorization of word lists 
and drills and exercises for the expan- 
sion of vocabulary are not sufficient. 
As Guilford (6, p. 368) says: 


Verbal comprehension is undoubtedly a 
very important trait in a verbal civilization, 
but its obvious role in education has often 
obscured the importance of other intellec- 
tual factors. A clearer knowledge of the 
other intellectual abilities should enable
us to reappraise the current approaches
to encouraging intellectual development 
through education. 


Summary 


The cognitive ability of 60 deaf and 
of 60 hearing children was examined 
with a test battery of a perceptual dis- 
crimination task, a modified Wisconsin 
Card Sorting Task, and a concept at- 
tainment and usage task, all presented 
visually and nonverbally. 

Results indicate that there is no sta- 
tistically significant difference between 
deaf and hearing children in their 
ability to perceive, abstract, or gen- 
eralize, nor are differences observed 
when these groups are compared with 
respect to age (8 years, 12 years); or 
with respect to type of school (paro- 
chial, public, private). The results are 
consistent with the hypothesis that no 
differences will be observed between 
deaf and hearing children in the tasks 
used here, where the language involved 
in these tasks is within the capacity 
of the deaf children. In those previous 
findings where conceptual deficit has 
been reported, it seems likely that the 
tasks have involved linguistic abilities 
beyond those of the deaf children 
tested. 


Acknowledgment 


The writer wishes to acknowledge 
the cooperation of the following prin- 
cipals and their staffs (in the St. Louis 
area): Sister Anna Rose, St. Joseph 
Institute for the Deaf; Dr. Helen S. 
Lane, Central Institute for the Deaf; 
Dr. Kenneth Mangan, Gallaudet Day 
School for the Deaf; Sister Evarista, 
St. Patrick’s School; Mr. Hillis Howie, 


The Community School; and Mr. 
Wayne Barnes, Stix Public School. 


References 


1. Berg, Esta A., A simple objective technique
for measuring flexibility in thinking.
J. gen. Psychol., 39, 1948, 15-22.

2. Bunch, M. E., and Hagman, E. P., The
influence of electric shocks for errors in
rational learning. J. exp. Psychol., 21,
1937, 330-341.

3. Edwards, A. L., Experimental Design in
Psychological Research. New York: Rinehart,
1950.

4. Fey, Elizabeth T., The performance of
young schizophrenics and young normals
on the Wisconsin Card Sorting
Test. J. cons. Psychol., 15, 1951, 311-319.

5. Grant, D. A., and Berg, Esta A., A
behavioral analysis of degree of reinforcement
and ease of shifting to new
responses in a Weigl-type card-sorting
problem. J. exp. Psychol., 38, 1948, 404-411.

6. Guilford, J. P., Personality. New York:
McGraw-Hill, 1959.

7. Hayes, G., A study of the visual perception
of orally trained deaf children.
Master’s thesis, Univ. Massachusetts,
1955.

8. Meyer, Edith, Psychological and emotional
problems of deaf children. Amer.
Ann. Deaf, 98, 1953, 472-477.

9. Myklebust, H. R., Towards a new understanding
of the deaf child. Amer.
Ann. Deaf, 98, 1953, 345-357.

10. Myklebust, H. R., and Brutten, M., A
study of the visual perception of deaf
children. Acta Otolaryng., Suppl. 105,
1953.

11. Oléron, P., Conceptual thinking of the
deaf. Amer. Ann. Deaf, 98, 1953, 304-310.

12. Rosenstein, J., Perception and cognition
in deaf and hearing children. Ph.D. dissertation,
Washington Univ., St. Louis,
1959.

13. Vinacke, W. E., The Psychology of
Thinking. New York: McGraw-Hill,
1952.

14. Warren, H. C., Dictionary of Psychology.
Boston: Houghton Mifflin, 1934.

15. Weir, M. W., and Stevenson, H. W.,
The effect of verbalization in children’s
learning as a function of chronological
age. Child Develpm., 30, 1959, 143-149.





Electrodermal Responses of Deaf Children 


WILLIAM W. GRINGS 


EDGAR L. LOWELL 


RONOLD R. HONNARD 


In a recent paper on the role of con- 
ditioning in electrodermal response 
(EDR) audiometry with children (3) 
it was emphasized that there are actually
few published factual data on the na- 
ture or conditioning of electrodermal 
responses with young children (6, 7, 8). 
Research reports on EDR audiometry 
utilizing such a population have largely
emphasized technique and data such 
as number of successful cases rather 
than quantitative facts of individual 
response records (2). 

The shortage of empirical informa- 
tion led to the planning of two studies 
to accumulate basic data on a sample 
of cases from a hearing impaired pre- 
school age population. In Study One 
the plan is to concentrate on basic 
EDR characteristics such as average 





William W. Grings (Ph.D., University of
Iowa, 1946) is Associate Professor of Psychology,
University of Southern California,
and Research Associate, John Tracy Clinic,
Los Angeles. Edgar L. Lowell (Ph.D., Harvard,
1952) is Administrator, John Tracy
Clinic. Ronold R. Honnard (B.A., University
of Southern California, 1956) is Research
Assistant, John Tracy Clinic. This investigation
was supported in part by a research
grant from the Bureau of Crippled Children
Services, Department of Public Health, State
of California, and Children’s Bureau, U. S.
Department of Health, Education, and Welfare.


Volume 3, No. 2 120 


magnitude and variability of response 
to certain classes of adequate stimuli 
(visual, auditory, and electrotactual), 
repeat test reliability, and rate of spon- 
taneous responding. Study Two is to 
utilize the Study One children in a 
two-stage classical conditioning series 
with the EDR as the response and 
electrotactual stimulation as the un- 
conditioned stimulus. The learned, or 
conditioned, stimulus was a light on 
one of the two occasions and a pure 
tone from an audiometer on the other 
occasion. One of the purposes of this 


‘study was to compare conditionability 


of these children to the different classes 
of stimuli (visual and auditory). 

The present report, limited to the 
preliminary phase of this two-part pro- 
gram, is concerned with the nature 
of electrodermal activity for a special 
population of children (preschool age 
deaf and hard of hearing) when the 
children are placed in a standardized 
laboratory environment and are pre- 
sented certain stimuli. The incidence 
(regularity), magnitude, and variabil- 
ity of the EDRs were observed. All 
children were measured on two suc- 
cessive occasions (test-retest), these 
being referred to as Session I and 
Session II. 


June 1960 






















Grings, Lowell, Honnard: EDR of Deaf Children 121 


Method 


Subjects. Subjects were 15 children, 
aged from two to six years, regularly 
enrolled in the John Tracy Clinic Dem- 
onstration Preschool, a school where 
one condition for admission is a severe 
auditory loss. 


General Procedure. Each child was 
observed on two different days for 
periods of approximately 45 minutes. 
On each day the programmed se- 
quence included (a) a three-minute 
rest period (for observing spontaneous 
activity), (b) somewhat over 30 min- 
utes of stimulation trials alternating 
with control trials, and (c) a second 
three-minute rest or spontaneous ac- 
tivity period. All time periods within 
the sequence were controlled auto- 
matically (independent of the experi- 
menter) and were identical on both 
days. 

The stimulation series involved four 
ascending tone sweeps, four presenta- 
tions of a light stimulus, and five 
presentations of an electrotactual
stimulus. (In each case the initial elec- 
trotactual stimulation was for the 
purpose of setting an appropriate in- 
tensity level for subsequent use.) The 
individual presentations to one mo- 
dality were not given all at once but 
in varied orders. For example, the first 
tone sweep might be followed by the 
first light stimulus, or the first electric 
stimulus, or the second tone sweep, 
depending on a system of random 
permutations. Only six completely 
different sets of such orders were used, 
rather than a different order for each 
child. 


Control (nonstimulation, or pseudo- 
stimulation) trials were interspersed 
before and after each actual stimulus, 
and the record was automatically 
marked as if an actual stimulation had 
occurred. Intertrial intervals also were 
varied automatically among durations 
of 10, 20, and 30 sec by a system of 
random permutations. 
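The programmed sequence described above can be sketched as a schedule generator. This is a rough illustration under several assumptions: each tone sweep is treated as a single event (although a sweep contains several tones), one control trial is placed before the first stimulus and after every stimulus, and the intertrial intervals are drawn as successive random permutations of 10, 20, and 30 sec. The article reports six fixed orders and does not give the actual sequences, so the names and layout below are hypothetical.

```python
import random

# Stimulus counts taken from the text; each tone sweep is one event here.
STIMULI = ["tone sweep"] * 4 + ["light"] * 4 + ["electrotactual"] * 5
INTERVALS_SEC = [10, 20, 30]

def interval_stream(rng):
    """Yield intervals as successive random permutations of 10, 20, 30 sec."""
    while True:
        block = INTERVALS_SEC[:]
        rng.shuffle(block)
        yield from block

def build_schedule(rng):
    """Return (trial, following interval in sec) pairs, with control
    trials alternating with the stimulation trials."""
    order = STIMULI[:]
    rng.shuffle(order)
    intervals = interval_stream(rng)
    schedule = [("control", next(intervals))]
    for stimulus in order:
        schedule.append((stimulus, next(intervals)))
        schedule.append(("control", next(intervals)))
    return schedule

schedule = build_schedule(random.Random(1))
```

Fixing the seed, as in the last line, corresponds to preparing one order in advance; six seeds would give six prepared orders of the kind the procedure used.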


Both the initial and final rest, or 
spontaneous activity, periods were seg- 
mented to correspond to intervals be- 
tween the stimulus presentations on 
stimulation trials. This was done to 
make it possible to invoke response 
latency criteria in the scoring of error 
responses during the rest intervals. The 
automatic programmer was set to 
register pseudostimulation intervals on 
the record. As far as the child was 
concerned, however, no experimental 
stimulus was administered to him dur- 
ing these periods. 

Measurements were made in two ad- 
jacent, quiet, but not sound treated, 
rooms. Two experimenters were in- 
volved, one working with the child, 
the other operating the equipment in 
the second room. The child and the 
experimenter working with the child 
could not see the equipment or the 
second experimenter, although voice 
communication was possible through 
a panel between the rooms. The first 
experimenter kept the child busy with 
toys while the second controlled the 
timer, stimulators, and recorder.
Throughout the session the child sat 
in a high chair wearing earphones, 
EDR electrodes on two fingertips, and 
electrostatic electrodes on one leg, as 
further detailed below. 










EDR Apparatus. The EDR was ob- 
tained as a DC resistance through 
¥,” x %” rectangular silver electrodes, 
bent to the contour of the finger, 
coated with electrode paste, and taped 
to the first and third finger of the 
child’s left hand. The electrodes led 
to a modified Darrow-type bridge 
from which accurate readings of initial 
resistances as well as resistance changes 
could be determined. The output of 
the bridge was amplified and recorded 
by a DC channel on an Offner Type 
T electroencephalograph. 


Tone Stimuli. The tone stimuli, pro- 
duced by a Beltone Model 15A audi- 
ometer, were delivered to the left ear 
through earphones. Each tone was 500 
cps for all subjects and was presented 
for 5 sec. The tone sweeps started at 
20 db re audiometer zero and contin- 
ued upward in serial steps of 10 db 
until two consecutive positive responses 
occurred or until 100 db was reached, 
whichever occurred first. Each child 
received four such sweeps during the 
course of Session I. In Session II 
(retest) the sweeps were exactly the 
same as they were in Session I. The 
criteria for a positive response to tone 
were (a) an amplitude of pen excur- 
sion representing at least 100 ohms re- 
sistance change, and (b) a latency of 
more than 1 sec and less than 5 sec. 
Intertrial intervals between tones with- 
in a sweep were varied, as were inter- 
vals between all adjacent stimuli (10 
sec, 20 sec, or 30 sec, randomly 
varied). 
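
The ascending sweep rule and the response criteria just described can be sketched in Python. This is an illustrative sketch only, not the authors' apparatus logic: the names `run_sweep`, `is_positive`, and the `responds` callback are hypothetical, the callback standing in for the scoring of each trial from the pen record.

```python
def is_positive(resistance_change_ohms, latency_sec):
    """Criteria from the text: at least 100 ohms resistance change and
    a latency of more than 1 sec and less than 5 sec."""
    return resistance_change_ohms >= 100 and 1 < latency_sec < 5


def run_sweep(responds, start_db=20, step_db=10, max_db=100):
    """Present ascending tone levels until two consecutive positive
    responses occur or the 100-db level has been presented.

    `responds` is a hypothetical callback: level in db -> bool.
    Returns the list of (level, positive) pairs actually presented.
    """
    trials = []
    consecutive = 0
    level = start_db
    while level <= max_db:
        positive = bool(responds(level))
        trials.append((level, positive))
        # A miss resets the run of consecutive positive responses.
        consecutive = consecutive + 1 if positive else 0
        if consecutive == 2:
            break
        level += step_db
    return trials
```

For a hypothetical child who first responds at 60 db, `run_sweep(lambda level: level >= 60)` presents 20 through 70 db and stops after the second consecutive response.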


Light Stimuli. The visual stimulus 
was a light box, two feet square, with 
incandescent bulbs behind a translucent 
screen. The intensity of the translucent 


surface could be varied by changing 
the voltage applied to the lights. It 
was arbitrarily decided to utilize three 
intensity levels, attempting first to 
determine a level just sufficient to 
produce a reaction, then to main- 
tain for remaining stimulations the 
level just below that sufficient to pro- 
duce a response. The first brightness 
level presented was 34 ft-L; the lowest 
level was 7 ft-L; and the highest level 
was 100 ft-L. Each light duration was 
5 sec, and the criteria for the occur- 
rence of a response were the same as 
for the other stimuli. 


Electrotactual Stimuli. The electro- 
tactual stimulus was a 60-cycle AC 
current obtained from a Grason-Stad- 
ler psychogalvanometer, Model E664, 
and was delivered through half-inch 
zinc electrodes taped two inches apart 
on the calf of the child’s left leg. Elec- 
trode paste was applied to the elec- 
trodes before they were attached; a 
reading of electrode resistance was 
made at the beginning and end of each 
session. Three intensity levels (.75 ma, 
1.2 ma, and 1.6 ma) were determined 
from the 25th, 50th, and 75th percen- 
tiles of frequency distributions of re- 
sponses to similar stimuli obtained dur- 
ing previous conditioning studies in 
the same laboratory with similar chil- 
dren. The middle intensity level was 
administered to each child first. If the 
child gave an EDR and did not express 
discomfort, the same level was used for 
subsequent stimulations. If the child 
expressed discomfort, the intensity was 
reduced to the lowest level for the 
remaining trials. If the child did not 
respond to the middle intensity, the 
third or highest level was employed 





















on subsequent stimulations. The dura- 
tion of the electrotactual stimulus was 
.6 sec. 
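
The rule for choosing the electrotactual level after the first (middle-intensity) stimulation can be written as a small decision function. This is a sketch under stated assumptions: the function name `level_for_remaining_trials` is hypothetical, and the text does not say what was done when a child showed discomfort without giving an EDR, so discomfort is given precedence here.

```python
# Intensity levels in milliamperes, as given in the text.
LEVELS_MA = {"low": 0.75, "middle": 1.2, "high": 1.6}


def level_for_remaining_trials(gave_edr, expressed_discomfort):
    """Return the level (ma) used after the first, middle-level trial."""
    if expressed_discomfort:
        # Assumption: discomfort always leads to the lowest level.
        return LEVELS_MA["low"]
    if gave_edr:
        # Response without discomfort: keep the middle level.
        return LEVELS_MA["middle"]
    # No response to the middle level: move to the highest level.
    return LEVELS_MA["high"]
```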


Stimulus durations were controlled 
electronically through the sequence of 
timers incorporated in the Grason- 
Stadler instrument previously referred 
to. Sequences of stimuli and intertrial 
intervals were controlled by means of 
prepunched tape in a Gerbrands syn- 
chronous motor programmer-timer. 


Results and Discussion 


Since this study does not center 
about the test of a specific experimental 
hypothesis, the presentation of data 
will be organized into sections dealing 
with various aspects of the general 
problem of eliciting EDRs from chil- 
dren: spontaneous response rates and 
rate of response on control trials; re- 
sponses to various classes of simple 
stimuli; and repeat measurement con- 
sistency. 

In all of the above areas objective 
criteria were applied in scoring re- 
sponses. Where frequency of response 
was a variable, a criterion for the oc- 
currence of a response was required. 
This involved latency characteristics 
(more than 1 sec and less than 5 sec) 
and amplitude requirements (a mini- 
mum pen excursion of 1 mm, or ap- 
proximately 100 ohms change). For 
comparison of magnitude of response 
the maximum change (within the la- 
tency limits) was expressed in the 
unit, square root of conductance 
change, obtained by subtracting the 
reciprocal of the resistance prior to 
stimulation from the reciprocal of the 
resistance following stimulation and 
taking the square root of the differ- 


ence. This unit had been found in 
previous studies (4) to yield greater 
symmetry and homogeneity of vari- 
ance than do simple resistance or con- 
ductance measurements. 
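
The magnitude unit described above can be illustrated with a short Python function. The sketch assumes that conductance is expressed in micromhos (one million divided by the resistance in ohms); the text does not state the scaling the authors used, and the function name is hypothetical.

```python
import math


def sqrt_conductance_change(r_before_ohms, r_after_ohms):
    """Square root of the conductance change: the reciprocal of the
    resistance before stimulation is subtracted from the reciprocal of
    the resistance after stimulation, and the square root is taken.

    Assumption: conductance in micromhos (1e6 / ohms).
    """
    delta = 1e6 / r_after_ohms - 1e6 / r_before_ohms
    if delta < 0:
        # A rise in resistance gives a negative conductance change,
        # for which the square-root unit is undefined.
        raise ValueError("resistance increased; no positive change")
    return math.sqrt(delta)
```

With hypothetical readings of 50,000 ohms before and 40,000 ohms after stimulation, the conductance rises from 20 to 25 micromhos and the function returns the square root of 5.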

Spontaneous Responding. The cen- 
tral issue in observing EDRs during 
periods when no stimulus was pre- 
sented to the child involved establish- 
ing some base rate for error responses, 
the responses occurring during audiometric
sessions which are not really
due to the presence of the test tone.
Such information was available from 
the three-minute spontaneous activity 
periods before and after stimulation, 
and from the control trials presented 
before and after each stimulation
trial.

The estimate of error-responding 
during the initial rest period was ob- 
tained from the automatically pro- 
grammed interval marks on the graphic 
response records which provided a 
time point for applying latency cri- 
teria to pseudostimulation trials. If a 
response of over 100 ohms occurred 
during the interval from 1 sec to 5 sec 
after the reference mark, it was la- 
belled one error response. Since there 
were 10 such control ‘trials’ during 
the rest period, a child could give any- 
where from zero to 10 error responses 
and this could be expressed as a per- 
centage (2 out of 10 being 20%, 3 
out of 10 being 30% and so on). Simi- 
lar percentages were determined for 
each child from the last rest period, 
and a third percentage of error-re- 
sponding was obtained from the con- 
trol trials interspersed during the 
stimulation series. 
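
The percentage of error-responding for one child can be computed as in the following sketch. The function name and the data layout are hypothetical, and the handling of the boundary values (exactly 100 ohms, exactly 1 sec or 5 sec) is an assumption, since the wording of the criterion varies slightly within the text.

```python
def error_response_percentage(control_trials, min_ohms=100,
                              window=(1.0, 5.0)):
    """Percentage of control trials scored as error responses.

    control_trials: one entry per control trial, either None (no
    deflection) or a (resistance_change_ohms, latency_sec) pair
    measured from the programmed reference mark.
    """
    if not control_trials:
        raise ValueError("at least one control trial is required")
    errors = 0
    for trial in control_trials:
        if trial is None:
            continue
        change, latency = trial
        # Assumption: "over 100 ohms" is strict; the window is inclusive.
        if change > min_ohms and window[0] <= latency <= window[1]:
            errors += 1
    return 100.0 * errors / len(control_trials)
```

Two qualifying deflections among ten control trials give 20%, in agreement with the example in the text.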

When distributions of these individ- 
ual percentages of error-responding 







Table 1. The percentages of responses occurring on control trials during period before stimulation, during stimulation, and after stimulation.

Response                    Session I Stimulation       Session II Stimulation
                            Before   During   After     Before   During   After
Median                       21.4     22.8     27.5      18.9     22.0     28.7
Seventy-fifth Percentile     27.6     39.0     35.7      23.6     32.6     35.7
Twenty-fifth Percentile      16.7      8.2     13.9      13.2     10.0     21.4








were made separately for the three 
periods in each of the two sessions, 
the results in Table 1 were obtained. 
There it will be noted that the medians 
of these distributions of error responses 
increase during the period of stimula- 
tion on both sessions. The individual 
variability of responding also appears 
to vary with experimental stimulation 
conditions. One method for evaluat- 
ing this is to note the range of values 
from the 75th percentile to the 25th 
percentile in the distribution of re- 
sponses, the figures for which are given 
in Table 1. In general, when the child 


is being exposed to auditory, visual,


and electrotactual stimulation condi- 
tions, his error-responding appears to 
increase over what it was before such 
stimulation. 

It will be noted that the median
values of such error-responding before 
stimulation are somewhat above the 
commonly quoted figure of 20% 
(10). The percentage obtained from 
control trials during stimulation is 
still larger. Since in both cases the 
absolute percentage is also a function 
of the stringency of latency and am- 
plitude criteria, a comparison was 
made using the criterion recommend- 
ed by Hind, Aronson, and Irwin (5). 
This included only responses with 
latencies from 1.4 to 2.9 sec. The more 


stringent latency criterion reduced the 
percentage of error-responding but 
did not change the direction of effect 
of stimulation condition on error; that 
is, the percentage was higher during 
and after stimulation than it was be- 
fore stimulation. 


Responses to Stimuli. The series of 
measurements involving light, tone, 
and electrotactual presentations had 
several purposes: (a) to provide esti- 
mates for each child of the intensity
of stimulation just adequate to elicit 
an EDR, information which could 
then be used in subsequent studies of 


Figure 1. Percentage of subjects responding
to the three classes of stimuli on four successive
presentations during Sessions I and II.
(Ordinate: percent of Ss responding; abscissa: trials 1 to 4;
separate curves for electric, light, and tone stimuli.)







Table 2. Magnitude of response to tone, light, and electrotactual stimuli in Sessions I and II, in square root of conductance change.

Response                    Session I                          Session II
                            Tone    Light         Elect.       Tone          Light         Elect.
Median                       .6      .8            1.1          .5            .6            1.5
Seventy-fifth Percentile     .8     [illegible]    1.5         [illegible]   [illegible]    3.7
Twenty-fifth Percentile      .2      .4             .9          .2            .5            1.3








conditioning; (b) to note the effect of
repeated stimulation, that is, factors 
like response adaptation; (c) to pro- 
vide estimates of the consistency of 
response from session to session; (d) 
to test for possible changes in error- 
responding, since, as was seen earlier, 
the error response on control trials 
fluctuates with other situational conditions;
(e) to provide data on response latency.




The intensity of the stimuli was 
evaluated with a very stringent cri- 
terion (no error responses on adjacent 
control trials). Over half the group 
responded to the lowest electrotactual 
intensity (.8 ma); over half responded 
to either the middle or the minimum 
light intensities (35 ft-L or less); and 
11 of the 15 children responded with 
EDRs only to the 100-db level tone. 

The temporal course of responding 


Figure 2. Distributions of response latencies for three stimuli in Test Sessions I and II.
(Panels: tactual, light, and tone stimuli for Session One and Session Two;
ordinate: frequency; abscissa: latency in seconds.)










is plotted in Figure 1 to permit com- 
parison of differences among stimuli. 
It will be seen that there is some 
evidence of response adaptation to tone 
and light. It will be remembered that 
no effort was made to equate the sub- 
jective intensity level of the three 
classes of stimuli, or the procedure 
for controlling intensity during their 
administration. With tones it was de- 
sired to cover a wide range of intensity. 
With shock it was desired to evaluate 
adaptation, and with light it was de- 
sired to find an intensity just below 
that adequate to elicit a response. These 
different purposes must be kept in 
mind in interpreting the results. 

Related items of interest are the 
average amplitude of response to each 
stimulus class and the amount of change 
in amplitude (adaptation) which oc- 
curred between the first and second 
sessions (Table 2). Differences in magnitude
of response were found among
the stimuli as well as some evidence of 
change from session to session. For 
tone and light the magnitude decreased, 
whereas for the electrotactual stimulus 
the response magnitude showed a 
slight increase. 

Since Aronson, Hind, and Irwin 
(1) had shown that EDR response 
latency distributions vary with inten- 
sity of tone stimuli with normal hear- 
ing college students, it seemed likely 
that response latencies with deaf chil- 
dren would differ with mode of stimu- 
lation. To check this, distributions 
were plotted as shown in Figure 2. 
Several results became apparent: (a) 
very few responses have latencies less 
than one second; (b) most responses 
have latencies between one and three 
seconds; (c) the distributions of la- 


tencies differ with the mode of stimu- 
lation; (d) the distributions remain 
consistent within stimuli from day to 
day; (e) latencies to tones show 
greatest variability for the three stimu- 
lus categories. In observing Figure 2 it 
is well to keep in mind that absolute 
numbers of responses to the stimuli 
vary. While the number of stimula- 
tions per stimulus class is constant, 
the electrotactual stimulus elicited 
more EDRs than did the tone. The re- 
sults above support the feasibility of 
using with children the latency rec- 
ommendation of Hind, Aronson, and 
Irwin (5) based on observations of 
adults. 


Repeat Measurement Consistency. 
One of the most important questions 
posed in evaluating work with the 
EDR concerns the reliability of the 
response process itself. Will the child 
react the same way on repeated oc- 
casions? Or will situational determiners 


of the moment contribute so much


variability that valid predictions be- 
come impossible? 


It is relevant to point out at the 
outset that skin electricity change 
measures have a bad reputation in this 
regard. They are variously described 
as unreliable, highly variable, and un- 
predictable. It is consistently empha- 
sized that one can never take reliabil- 
ity of EDR data for granted. Skin 
electricity changes are complexly de- 
termined and for that reason measure- 
ment characteristics of EDRs should 
be checked at every turn, especially 
when new procedures, instruments, or 
populations of subjects are involved.
As yet no one, to the knowledge of the 
writers, has carefully reported relia- 



















Table 3. Rank difference correlations between percentage of error-responding for different periods during repeat experimental sessions.

                          Session I Stimulation        Session II Stimulation
                          Before   During   After      Before   During   After
Session I Stimulation
  Before                            .28      .48*       .22      .31     -.09
  During                                     .70*      -.06      .15     -.06
  After                                                 .23      .00     -.20
Session II Stimulation
  Before                                                         .35      .45*
  During                                                                  .33
  After

*Significant at the 5% level.


bility data on EDR behavior with very 
young, hearing-impaired children. 
One question of response consist- 
ency involves error responding as pre- 
sented in Table 1. The question was 
asked whether children tend to give
similar amounts of error on different 
occasions. A simple way to note con- 
sistency of spontaneous responding on 
an individual subject level is to rank 
the children in terms of their rate 
of responding on control trials (highest
to lowest percent of error responses). 
If this is done on different occasions 
and rank difference correlation coeffi- 
cients computed, estimates of consist- 
ency can be made. A small matrix of 
such correlations is presented in Table 
3. It will be noted that the correlations 
within sessions are moderately positive 
—the child giving many error responses 
continues to do so throughout a given
session. Correlations between sessions 
are lower and show more variability. 
The only statistically significant corre- 
lations occur within the same sessions. 
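
The rank difference correlation used here can be sketched as follows. The sketch applies the classical formula rho = 1 - 6(sum of d squared) / (n(n squared - 1)) to ranks assigned from highest to lowest, giving tied scores their mean rank; with ties the formula is only approximate. The function names are hypothetical and any percentages passed in would be illustrative, not the data of the study.

```python
def average_ranks(values):
    """Rank from highest (rank 1) to lowest; ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_difference_rho(x, y):
    """Rank difference correlation between two paired lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two paired lists of at least two scores")
    n = len(x)
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * d_squared / (n * (n * n - 1))
```

Here `x` and `y` would hold each child's percentage of error responses in two periods, in the same child order.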


When the magnitudes of response to 


specific stimuli are evaluated, other 
tests of EDR consistency can be made. 
It is customary with bioelectric phe- 
nomena to differentiate between two 
classes of reliability or consistency 
measures. One is described as ‘within- 
session’ reliability and reflects the 
moment-to-moment variability of sub- 
jects during a single observation period. 
This value for the EDR with a college 
student population is quite high—pub- 
lished studies (9) report reliability 
correlations of about .90. 

Within-session coefficients, however, 
do not reflect such factors as taking 
off electrodes and putting them on 
again, as well as a multitude of day- 
to-day changes in the subject and his 
environment which occur if estimates 
of consistency are made between ex- 
perimental sessions on different days 
(described as ‘between-session’ relia- 
bility). Most studies make measure- 
ments in one session where the elec- 
trodes are not taken off and diurnal 
variation is minimal. 

The present study makes possible 







Table 4. Rank difference correlations between
magnitude of response elicited in Sessions I and
II for the first stimulation, last stimulation, and
average of four stimulations for tone, light, and
electrotactual stimuli.

Stimuli      First          Last           Average
             Stimulation    Stimulation
Tone          .23            .25            .49*
Light         .36            .39            .36
Electric      .23            .47*           .51*

*Significant at the 5% level.


the comparison of both reliability esti- 
mates as they apply to this particular 
laboratory and sample of children. 
Table 4 shows correlations (rho) be- 
tween magnitude of response to the 
first stimulus, last stimulus, and aver- 
age of four stimulations for each 
stimulus class on the two sessions. 
As had been predicted, these be- 
tween-sessions correlations are of only 
moderate magnitude. Those to light 
had been adversely affected by experi- 
mental conditions which sought to 


maintain stimulus intensity at the point 


of minimum response. All are influ- 
enced by numbers of possible day-to- 
day variations. 


Table 5 presents comparable data 


Table 5. Rank difference correlations between
magnitudes of response within sessions, average
of response to stimulations 1 and 3 correlated
with the average of response to stimulations 2
and 4 for each stimulus category for Sessions
I and II.

Stimulus     Session I    Session II    Combined
Tone          .44          .63*          .56*
Light         .80*         .85*          .87*
Electric      .76*         .80*          .81*

*Significant at the 5% level.


obtained within sessions. In this case, 
the average magnitude of response to 
the first and third presentations of each 
stimulus has been correlated with the 
average response to the second and 
fourth presentation of each stimulus for 
each session. All values are much 
higher than those obtained between
sessions and all but one are statistically 
significant. The responses to light and 
electric stimuli are more consistent 
than responses to tone stimuli, as would 
be expected with this particular popu- 
lation of subjects. 
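
The within-session estimate rests on splitting each child's four responses to a stimulus class into two halves. A minimal sketch of that split is given below; the function names and the data layout are hypothetical, and the two resulting lists would then be correlated by the rank difference method.

```python
def split_half_scores(magnitudes):
    """Return (mean of presentations 1 and 3, mean of presentations 2
    and 4) for one child's four response magnitudes, given in order of
    presentation."""
    if len(magnitudes) != 4:
        raise ValueError("expected four presentations")
    first, second, third, fourth = magnitudes
    return (first + third) / 2, (second + fourth) / 2


def split_half_columns(children):
    """Build the two lists to be correlated, one entry per child.

    children: list of four-item magnitude lists, one per child.
    """
    pairs = [split_half_scores(m) for m in children]
    odd_half = [p[0] for p in pairs]
    even_half = [p[1] for p in pairs]
    return odd_half, even_half
```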


Summary 


Fifteen severely deaf preschool-aged 
children were administered controlled 
series of light, tone, and electrotactual 
stimuli on two different occasions in 
order to obtain normative data on the 
electrodermal behavior of these chil- 
dren. The rate of spontaneous re- 
sponding was evaluated by control 
trials during rest periods and between 
periods of stimulation. This error re- 
sponding was compared from one ses- 
sion to another. 


The percentage of subjects respond- 
ing on a given trial as well as their 
response magnitudes and latencies were 
compared across sessions and types of 
stimulation. The response magnitudes 
were correlated to achieve estimates of 
reliability. Two types of reliability were 
estimated: within-session reliability re- 
flecting moment-to-moment variability 
of the subjects during a single observa- 
tion period; and, between-session re- 
liability reflecting the multitude of 
day-to-day changes in the subject and 
such equipment variations as changing 
electrodes.



















Acknowledgment 


The authors wish to acknowledge 
the assistance of Miss Eloise Jones in 
carrying out these studies. 


References 


1. Aronson, A. E., Hind, J. E., and Irwin, J. V., GSR auditory
   threshold mechanisms: effect of tonal intensity on amplitude and
   latency under two tone-shock intervals. J. Speech Hearing Res., 1,
   1958, 211-219.

2. Goldstein, R., Ludwig, H., and Naunton, R. F., Difficulty in
   conditioning galvanic skin responses: its possible significance in
   clinical audiometry. Acta Otolaryng., 44, 1954, 59-77.

3. Grings, W. W., Lowell, E. L., and Rushford, Georgina M., Role of
   conditioning in GSR audiometry with children. J. Speech Hearing
   Dis., 24, 1959, 380-390.

4. Grings, W. W., and O'Donnell, D. E., Magnitude of response to
   compounds of discriminated stimuli. J. exp. Psychol., 52, 1956,
   354-359.

5. Hind, J. E., Aronson, A. E., and Irwin, J. V., GSR auditory
   threshold mechanisms: instrumentation, spontaneous response and
   threshold definition. J. Speech Hearing Res., 1, 1958, 220-226.

6. Jones, H. E., Conditioned psychogalvanic responses in infants.
   Psychol. Bull., 25, 1928, 183-184.

7. Jones, H. E., The retention of conditioned emotional reactions in
   infancy. J. genet. Psychol., 37, 1930, 485-498.

8. Jones, H. E., The study of patterns of emotional expression. In
   M. L. Reymert (Ed.), International Symposium on Feelings and
   Emotions, chap. 13. New York: McGraw-Hill, 1950.

9. Lauer, A. R., Reliability of the galvanic reflex. Amer. J.
   Psychol., 41, 1929, 263-270.

10. Stewart, K. C., Some basic considerations in applying the GSR
    technique to the measurement of auditory sensitivity. J. Speech
    Hearing Dis., 19, 1954, 174-183.





Auditory Discrimination Learning 


by Aphasic and Nonaphasic Children 


LILLIAN F. WILSON 


DONALD G. DOEHRING 


IRA J. HIRSH 


In their daily experience, teachers of 
aphasic children become acutely aware 
of limitations in their students, par- 
ticularly with respect to perception and 
associative learning. A general difficulty 
in comprehending and recalling sounds 
has been attributed by various writers 
to a deficiency in integrating sensory 
information, to an inability to use cer- 
tain types of symbolic structure, or to 
a specific weakness in the memory for 
speech sounds, words, and sentences 
(3). The present study grew out of 
such experiences of one of the authors 
in a period when she was attempting to 
teach a group of aphasic children to dis- 
tinguish among and identify sounds 





Lillian F. Wilson (M.S., Washington Uni- 
versity, 1959) is a teacher in the Houston 
Independent Public School District. Donald 
G. Doehring (Ph.D., Indiana University, 
1954) was Research Associate, Central Insti- 


tute for the Deaf, and is now with the 


Department of Surgery, Indiana University 
Medical Center. Ira J. Hirsh is Assistant Di- 
rector of Research, Central Institute for the 
Deaf. This paper is based upon research conducted
in connection with Miss Wilson's M.S. thesis.
The research was initiated with
the support of a grant (B-240) and was con- 
tinued under a grant (B-1710) to Central 
Institute for the Deaf from the National 
Institute of Neurological Diseases and Blind- 
ness of the National Institutes of Health. 




from different noisemakers. (‘Aphasic 
children’ are children who have not 
learned to use or comprehend language, 
although language development would 
have been expected on the basis of their 
intelligence, hearing level, and contact 
with the physical and social environ- 
ment.) All of the children had diffi- 
culty in discriminating the sounds of a 
drum, a xylophone, two ‘crickets,’ two 
horns, a whistle, and two bells. All of 
the sounds were audible, but discrimination
among them appeared to be


difficult. There was special difficulty 
among sounds that seemed to have cer- 
tain acoustic features in common. For 
example, there was confusion between 
horns and whistles, and there was also 
confusion among the drums, xylophone, 
and crickets. The first group consisted 
of long, continuous sounds, while the 
second involved brief, transient sounds. 
These classroom observations led to a 
consideration of the hypothesis that 
certain kinds of perceptual discrimina- 
tion are characteristically difficult for 
aphasic children. In order to subject 
this hypothesis to experimental test, the 
number of stimulus dimensions was re- 
duced to two and the problem set under 
experimental control. 


















Briefly, the present study attempts 
to assess the ability of aphasic and non- 
aphasic children to discriminate among 
four sounds that differ with respect to 
two acoustic dimensions and to associ- 
ate each of these sounds with a differ- 
ent visual stimulus. 


Procedure 


Two values within each of two 
acoustic dimensions, duration and 
quality, yielded the following four 
sounds: short tone, long tone, short 
noise, and long noise. The children were 
asked to distinguish among these four 
sounds by learning to associate each 
with one of four randomly selected 
letters of the alphabet. Letters, rather 
than simple visual stimuli like squares 
or circles, were chosen because it was 
assumed that with these all of the chil- 
dren would have had similar experience. 
Such nonverbal stimuli as simple geo- 
metrical figures were not used because 
it was assumed that the nonaphasic chil- 
dren would have a greater capacity 
than the aphasic children for associat- 
ing verbal labels with ‘meaningless’ 
visual stimuli. 

Subjects. From the population of 
aphasic children enrolled in the speech 
department of Central Institute for the 
Deaf (CID) 14 were selected as sub- 




jects. These 14 children were con- 
sidered to be primarily sensory aphasic, 
according to the criteria given by Mc- 
Ginnis, Kleffner, and Goldstein (2), 
and all but two of them also showed 
some hearing loss. The nonaphasic, or 
control group, was selected from an- 
other clinical population, namely the 
lipreading and auditory training classes 
at CID, so that comparable amounts of 
hearing loss would be present, without 
other communicative disorders. The 
nonaphasic group included 10 such chil- 
dren, plus three with normal hearing 
drawn from the speech clinic at CID. 
There were eight girls and six boys 
among the 14 aphasics, and nine girls 
and four boys among the 13 nonapha- 
sics. The two groups may be compared 
with respect to mean and range of age, 
the IQ, and hearing loss by reference to 
Table 1. IQ scores are based upon the 
Advanced Performance Scale (1) for 
both groups. Measures of hearing loss 
represent the average of losses at 500, 
1000, and 2000 cps. Although the 
groups appear to be fairly well 
matched, the nonaphasic children tend 
to be a little older, to have slightly 
higher IQs, and to exhibit less hearing 
loss than the aphasic children. 


Apparatus. The two auditory stimuli 
consisted of a complex tone with a 


Table 1. Age, IQ, and hearing loss of aphasic (experimental) and nonaphasic (control) groups.

Group         N     Age              IQ                 Hearing Loss (db)
                    Mean   Range     Mean   Range       Mean   Range
Aphasic       14    8.1    7-11      114    93-146      56     9-80
Nonaphasic    13    8.5    7-11      120*   103-146     34     0-63

* Based on the IQ scores of only 10 children; three of the children had not been given an IQ test.










fundamental frequency of 225 cps, gen- 
erated by a simple oscillator, and a 
white noise, produced by a random- 
noise generator. Either tone or noise 
was presented by depressing one of two 
push buttons. The duration of either 
sound was determined by the length of 
time that the button was held down. 
With practice, the experimenter con- 
sistently produced a short duration of 
about 0.3 sec and a long duration of 
about 0.8 sec. Either signal was fed 
through a power amplifier, an attenu- 
ator, and a matching transformer to a 
single Permoflux (PDR-10) earphone. 
This live earphone was mounted in a 
sponge-rubber cushion held in a spring 
headband, with a dummy earphone 
mounted in the same type of cushion 
on the opposite side. 


Method. Each subject was seated in 
a chair facing a blackboard. The head- 
set was placed on the subject’s head 
with the live earphone over the better 
ear. For those subjects who wore hear- 
ing aids, the hearing-aid receiver was 
removed and the live earphone was 
placed over the ear ordinarily used for 
the hearing aid. The experimenter was 
seated so that she could see the sub- 
ject’s face but the subject could not 
see her manipulating the push buttons. 


The subject was instructed verbally 
and in pantomime that this was to be 
a game. The language impairment of 
the aphasic children required that the 
experimenter communicate with them 
primarily with gestures. 

Before the testing started, the experi- 
menter made certain that each subject 
could hear both sounds. Throughout 
the experiment, sounds were intro- 
duced at a level of 110 db SPL for 
children with hearing loss, and at a 


level of 80 db SPL for children with 
normal sensitivity. Two of the hard- 
of-hearing, nonaphasic children com- 
plained of too much loudness and for 
them the level was reduced to 90 db. 
Also, one of the aphasic children with 
normal sensitivity complained that the 
sounds were too soft, and for him the 
level was raised to 90 db. 

The visual stimuli were the letters 
M, F, B, and O. The connections be- 
tween these letters and the four audito- 
ry stimuli were randomly different for 
each subject. In a pretraining session, 
the letters were presented individually 
on 3.5”x4” cards. A printed capital 
letter appeared on one side for the non- 
aphasic children, and a lower-case let- 
ter in cursive script on the other side 
for the aphasic children. (Cursive script 
is used in teaching the aphasic children 
at CID while block letters have com- 
parable place in the school experience 
of the nonaphasics.) In this pretraining 
session, the experimenter twice demon- 
strated the connection between a letter 


and its associated auditory stimulus.


Figure 1. Distribution of number of trials
required to achieve criterion of learning. The
six aphasic children who failed to learn the
task in 80 trials were arbitrarily assigned a
value of 'more than 80' trials-to-criterion.
(Legend: nonaphasic, N = 13; aphasic, N = 14.
Ordinate: number of subjects; abscissa: trials to criterion.)







She placed a card with a letter on the 
ledge of the blackboard, said ‘Listen,’ 
gave the appropriate sound, said ‘Point,’ 
and then pointed to the card. This pro- 
cedure was repeated, but this second 
time the experimenter waited for the 
subject to point. If the subject failed 
to point, the experimenter said ‘Point’ 
again, and guided the subject’s finger 
to the card. 

The pretraining period was followed 
by the experiment proper which con- 
sisted of a training period during which 
the experimenter presented the audito- 
ry stimuli and the subject was required 
to respond by pointing to the correct 


letter on a 7” x 8” card that contained 
all four letters. Each card was divided 
into four equal sections that contained 
the letters in block capitals on one side 
and in lower-case cursive script on the 
other side. There were 24 of these 
cards, containing all possible permuta- 
tions of the positions of the letters. The 
cards were presented to each subject 
in a different random order, and the 
series of 24 cards was repeated if more 
than 24 trials were required to reach 
the predetermined number of consecu- 
tive correct responses which would 
complete a training series for one sub- 
ject. This predetermined criterion of 


Table 2. Distribution of incorrect responses made to each stimulus. (For both stimulus and response,
LT = long tone; ST = short tone; LN = long noise; SN = short noise.) Numbers in parentheses
represent percentage of total errors for the group.

Group       Response  Stimulus
                      LT          ST          LN          SN          Total
Aphasic     LT                    58 (14.5)   18 (4.5)    26 (6.5)    102 (25.6)
            ST        41 (10.3)               22 (5.5)    44 (11.0)   107 (26.8)
            LN        17 (4.3)    16 (4.0)                50 (12.5)   83 (20.8)
            SN        21 (5.3)    40 (10.0)   46 (11.5)               107 (26.8)
            Total     79 (19.8)   114 (28.6)  86 (21.6)   120 (30.1)  399
Nonaphasic  LT                    20 (14.4)   3 (2.2)     13 (9.4)    36 (25.9)
            ST        28 (20.1)               3 (2.2)     7 (5.0)     38 (27.3)
            LN        9 (6.5)     8 (5.8)                 20 (14.4)   37 (26.6)
            SN        3 (2.2)     8 (5.8)     17 (12.2)               28 (20.1)
            Total     40 (28.8)   36 (25.9)   23 (16.5)   40 (28.8)   139
















learning was six consecutive correct 
responses. The criterion measure was 
thus the number of trials required to 
achieve the criterion of learning. If this 
criterion of learning had not been 
achieved after 80 trials, the task was 
discontinued. 
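
The card set and the learning criterion can be illustrated with a short Python sketch. The names `all_cards` and `trials_to_criterion` are hypothetical. It is assumed here that the score counts every trial up to and including the sixth consecutive correct response; the text does not say whether the criterion run itself was included.

```python
from itertools import permutations

LETTERS = ("M", "F", "B", "O")


def all_cards():
    """All arrangements of the four letters over the four card sections."""
    return list(permutations(LETTERS))


def trials_to_criterion(outcomes, run_length=6, max_trials=80):
    """Trial number on which the run of consecutive correct responses
    is completed, or None if the criterion is not met in max_trials.

    outcomes: list of booleans in trial order, True for correct.
    """
    streak = 0
    for trial, correct in enumerate(outcomes[:max_trials], start=1):
        # An incorrect response resets the run.
        streak = streak + 1 if correct else 0
        if streak == run_length:
            return trial
    return None
```

Four letters in four positions give 4 x 3 x 2 x 1 = 24 arrangements, which matches the 24 cards described. A return value of None corresponds to the children scored as 'more than 80'.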


On all trials the experimenter posi- 
tively reinforced a correct response by 
nodding her head and smiling and neg- 
atively reinforced incorrect responses 
by shaking her head and frowning. If 
the response was incorrect, the subject 
was shown the correct response and the 
stimulus was repeated. The auditory 
stimuli were presented to each subject 
in a different random order, with the 
restriction that each stimulus be given 
once in every block of four trials. Dur- 
ing the testing period the subject was 
given an opportunity to rest every 15 
minutes. 
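
The presentation rule and the stopping rule described above can be sketched in Python. This is only an illustration under stated assumptions: the function names and the example response sequence are hypothetical, and the sketch assumes that the trials-to-criterion score counts the six criterion trials themselves, which the text does not state explicitly.

```python
import random

STIMULI = ["LT", "ST", "LN", "SN"]  # long tone, short tone, long noise, short noise


def blocked_order(n_trials, rng):
    """Random stimulus order in which each stimulus occurs once per block of four trials."""
    order = []
    while len(order) < n_trials:
        block = STIMULI[:]
        rng.shuffle(block)          # new random order within every block
        order.extend(block)
    return order[:n_trials]


def trials_to_criterion(correct, run_length=6, max_trials=80):
    """Trial number on which the run of consecutive correct responses is completed.

    Returns None when the criterion is not reached within max_trials
    (the task was discontinued after 80 trials).
    """
    run = 0
    for trial, ok in enumerate(correct[:max_trials], start=1):
        run = run + 1 if ok else 0  # an error resets the run
        if run == run_length:
            return trial
    return None


# Hypothetical response record: errors on trials 1 and 3, then six correct in a row.
responses = [False, True, False] + [True] * 6
print(trials_to_criterion(responses))  # 9
```

A subject who never produces six consecutive correct responses in 80 trials yields None here, which corresponds to the score of 'more than 80' used in the analysis.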


Results 


Trials-to-Criterion. The distribution 
of trials-to-criterion for each group 
presented in Figure 1 indicates that the
performance of the aphasic children as 
a group was much more variable than 
that of the nonaphasic children. Of the 
14 aphasic children, six failed to achieve 
the criterion of learning in 80 trials, 
while all of the 13 nonaphasic children 
achieved the criterion within 50 trials. 
The nonaphasic children, on the aver- 
age, required fewer trials-to-criterion 
than the aphasic children. However, 
the three fastest learners in the aphasic 
group reached the criterion in almost 
as few trials as the three fastest learners 
in the nonaphasic group. To make 
possible a test of the difference be- 
tween groups, a subject who did not 
learn the task was arbitrarily assigned
a score of ‘more than 80’ trials-to-criterion.
The difference between
groups in number of learning trials was 
significant at the .01 level, as deter- 
mined by the Mann-Whitney Test (4). 
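
Because the Mann-Whitney Test depends only on the rank order of the scores, the arbitrary score of 'more than 80' can be represented by any value above 80 without changing the statistic. The following Python sketch illustrates this point. The scores are invented for the illustration and are NOT the data of this study; the function name is likewise hypothetical.

```python
def mann_whitney_u(x, y):
    """Return (U_x, U_y) for two samples, using midranks for tied scores."""
    pooled = sorted(x + y)
    midrank = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # zero-based positions i..j-1 hold ranks i+1..j; store their mean
        midrank[pooled[i]] = (i + 1 + j) / 2
        i = j
    rank_sum_x = sum(midrank[v] for v in x)
    u_x = rank_sum_x - len(x) * (len(x) + 1) / 2
    return u_x, len(x) * len(y) - u_x


# Hypothetical trials-to-criterion scores (not the published data).
MORE_THAN_80 = 81                      # placeholder for subjects who did not learn
aphasic = [12, 20, 35, MORE_THAN_80, MORE_THAN_80]
nonaphasic = [10, 15, 18, 25]

print(mann_whitney_u(aphasic, nonaphasic))  # (16.0, 4.0)

# Any other placeholder above 80 leaves the ranks, and hence U, unchanged.
aphasic_alt = [12, 20, 35, 1000, 1000]
print(mann_whitney_u(aphasic_alt, nonaphasic))  # (16.0, 4.0)
```

The sketch stops at the U statistic; the significance level would still have to be taken from tables such as those in reference (4).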


Analysis of Errors. Table 2 shows 
the distribution of incorrect responses 
made to each stimulus by each of the 
two groups. The four columns indicate 
the four different stimuli while the four 
rows indicate the kinds of responses 
made. The numbers in parentheses
represent percentage of total errors for
the group. First of all, the main effect, 
the difference between the two groups, 
is illustrated here by a total of 399 
errors for the aphasic group and a total 
of 139 errors for the nonaphasic group. 
For neither group (see the totals and 
percentages at the right in Table 2) 
was there a striking response prefer- 
ence, that is, no particular response 
seemed to be given more often than 
others. The aphasic group, however, 
(see the totals and percentages along 
the bottom in Table 2) generally made 
more errors for short stimuli, whether 
tone or noise, than for the long stimuli. 
A similar tendency for the nonaphasic 
group is apparent only with respect to 
noise. Comparisons between the two 
groups may more easily be made by 
reference to the percentages of total 
errors for the group. It is clear that for 
the nonaphasic group, errors with re- 
spect to duration are more frequent 
than errors with respect to sound 
quality. In the aphasic group, on the 
other hand, this same tendency is more 
evident when the stimulus is long. 
When the stimulus was either a short 
tone or a short noise, there were more 
errors but these errors appear to be 
related about equally to confusions of
both duration and quality. This last
observation suggests the possibility that
if there is a difference between the two 
groups with respect to a particular kind 
of discriminative difficulty, it is that 
the aphasic children have more diffi- 
culty than the nonaphasic children in 
discriminating the quality of short 
sounds. 
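
The comparison of duration errors with quality errors can be made explicit by sorting each off-diagonal cell of Table 2 according to which attribute of the stimulus was confused. The Python sketch below is offered only as an illustration of that bookkeeping; it is not an analysis performed by the authors. The cell counts are taken from Table 2, with the aphasic ST-response to LT-stimulus cell taken as 41, the value implied by the row total of 107 and the percentage of 10.3. The function name is hypothetical.

```python
# errors[response][stimulus] = number of incorrect responses (Table 2)
APHASIC = {
    "LT": {"ST": 58, "LN": 18, "SN": 26},
    "ST": {"LT": 41, "LN": 22, "SN": 44},   # 41 inferred from the row total of 107
    "LN": {"LT": 17, "ST": 16, "SN": 50},
    "SN": {"LT": 21, "ST": 40, "LN": 46},
}
NONAPHASIC = {
    "LT": {"ST": 20, "LN": 3, "SN": 13},
    "ST": {"LT": 28, "LN": 3, "SN": 7},
    "LN": {"LT": 9, "ST": 8, "SN": 20},
    "SN": {"LT": 3, "ST": 8, "LN": 17},
}


def classify(errors):
    """Tally errors by the attribute confused: duration only, quality only, or both."""
    tally = {"duration": 0, "quality": 0, "both": 0}
    for response, row in errors.items():
        for stimulus, count in row.items():
            same_duration = response[0] == stimulus[0]   # L or S
            same_quality = response[1] == stimulus[1]    # T or N
            if same_quality:
                kind = "duration"      # right quality, wrong duration
            elif same_duration:
                kind = "quality"       # right duration, wrong quality
            else:
                kind = "both"
            tally[kind] += count
    return tally


print(classify(APHASIC))     # {'duration': 195, 'quality': 119, 'both': 85}
print(classify(NONAPHASIC))  # {'duration': 85, 'quality': 27, 'both': 27}
```

On this tally, duration-only confusions make up 85 of the 139 nonaphasic errors and 195 of the 399 aphasic errors, which is consistent with the tendency described in the text.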


None of these analytic observations 
was subjected to statistical analysis and 
none of them appears to be striking 
from a casual inspection of the table. 
The main finding, of course, is that the 
aphasic children as a group made many 
more errors than the nonaphasic group. 


Some Additional Observations 


Posttraining Period. The six aphasic 
children who failed to learn the task in 
80 trials were given further training. 
The associative task was made easier 
by showing a single card and repeating 
the auditory stimulus for the letter on 
that card several times. Furthermore, 
the child was asked to describe the 
sound before he pointed to the card. 
With single-card presentations of this 
kind, all of the children were able to 
form correct associations between two 
visual and two auditory stimuli and, 
after additional practice, between three 
visual and three auditory stimuli. A 
fourth card brought about more errors 
but these disappeared also with prac- 
tice. When one of the large cards, 
which contained all four letters, was 
substituted for the four single cards, 
only one child was successful. If the 
large composite card was presented in 
the presence of the four single cards, 
so that the child could use the single 
cards as a kind of reference, more success
was observed. These nonsystematic
observations suggested that the deficiency
observed in the main experiment
reported above probably had to do 
more with the complexity of the as- 
sociative process than with the per- 
ceptual discrimination of the four 
acoustic stimuli. 


Later Training. About a year after 
the original tests, two of the authors*
attempted to determine whether or not 
the performance of the aphasic children 
had been impaired by a particular dif- 
ficulty in discriminating the quality of 
short sounds, as had been suggested by 
the analysis of errors shown in Table 2. 
Two aphasic children and two non- 
aphasic children, all with normal 
auditory sensitivity, were tested with 
respect to discrimination between a 
250-cps tone and a white noise, each 
as a function of duration. The duration 
of tone or noise bursts was controlled 
by a pulse generator and an electronic 
switch. All sounds were presented 
through a loudspeaker in a free field. 
All four of the children correctly
discriminated tone from noise on all trials,
even when the stimulus duration was 
reduced to 0.02 sec and the intensity 
of the sounds was reduced to about 60 
db SPL. One of the two aphasic chil- 
dren had made only four errors on the 
original learning experiment, and there- 
fore her accuracy was not particularly 
surprising; but the other aphasic child 
was one of those who had failed to 
learn the original task within 80 trials. 
Since this latter child could discrim- 
inate tone from noise, even when the 
sounds were as short as 0.02 sec, his 
performance on the original learning 


*With the cooperation of Herbert N. 
Wright. 







task was probably not limited by his 
ability to distinguish the quality of 
short sounds. 

This same child, in further training, 
learned to say: ‘Long tone,’ ‘Short tone,’ 
‘Long noise,’ or ‘Short noise,’ each as 
appropriate responses to tones or noises 
of durations equal to 0.8 or 0.3 sec, the 
durations that had been used in the 
original experiment. This child could 
thus identify the sounds after only a 
single demonstration of each of the 
four connections. For him at least, the 
original failure was not related to a 
difficulty in discriminating among 
sounds. The original learning task was 
also repeated at this time, and despite 
the fact that he was a year older and 
had been given a great deal of addition- 
al training, he required 37 trials to 
reach the criterion. Thus his failure 
appears to be specific to an association 
between visually-presented letters and 
auditory stimuli. 


Discussion 


Of the 14 aphasic children who were
used in the main experiment, eight of
them learned the task in less than 65
trials while six of them did not learn
it even after 80 trials. These two sub- 
groups of aphasic children are not sig- 
nificantly different in age, IQ, or 
hearing loss. It cannot be said, there- 
fore, that the failure to learn this par- 
ticular task is characteristic of the 
entire sample of aphasic children, nor 
is it possible to identify subcharacter- 
istics within the sample that appear to 
be important for differentiating those 
children who learned this task from 
those who did not. 


The additional observations that were 
made after the experiment itself was
concluded can only suggest certain
hypotheses to be proved or disproved 
by further systematic study. Two items 
of observation appear to be important. 
First, in the extra training that followed 
the experiment, it was found that chil- 
dren who were unable to learn the orig- 
inal task could associate each of the 
four sounds with its respective letter 
if the letters were presented separately. 
Thus, it appears that the confounding 
of the visual stimuli within a single dis- 
play is an important barrier to success 
for some of these children. Second, in 
the psychophysical observations made 
on two aphasic children, it appeared 
that both the quality and duration of 
similar sounds could be easily discrim- 
inated even though they could not be 
successfully associated with the visual 
stimuli. In general, these later observa- 
tions indicate that the experiment, 
which was planned to test a combina- 
tion of perceptual discrimination and 
learning, resulted in more errors and 
failure for aphasic children primarily 
because of the associative aspect of the 
task, and not because of any special 
difficulty related to auditory discrimi- 
nation. 

The general finding that the aphasic 
children made more errors than the 
children in the nonaphasic control 
group requires further study before it 
may be concluded that this difficulty 
is a concomitant of whatever condition 
underlies the aphasic symptoms that are 
used to classify these children. If it 
turns out that such associative tasks 
characteristically present difficulty for 
certain aphasic children, then the ne- 
cessity will remain to find out what 
subtypes of aphasic children would 
succeed in these learning tasks and what 











~ 








ee a a ae 


ev == OV 








Wilson, Doebring, Hirsh: Learning by Aphasic Children 137 


subtypes would have difficulty or 
would fail to learn such tasks. 


Summary 


The performance of a group of 14 
children classified as sensory aphasic 
was compared with that of a group of 
13 nonaphasic children on a task where
the children were taught to associate 
four auditory stimuli with four visual- 
ly-presented letters of the alphabet. 
The auditory stimuli differed in both 
quality and duration, and consisted of 
a long tone, a short tone, a long noise, 
and a short noise. Over half of the 
aphasic children learned the task in 
about the same number of trials as the 
nonaphasic children. The remaining 
aphasic children, six in number, failed 
to learn the task within the allotted 
number of trials. This difference in 
learning ability within the aphasic
group was unrelated to age, IQ, or
amount of hearing loss. Informal ob- 
servations on further training of the 
children who had failed to learn the 
task indicated that they were able 
to make the required discriminations 
among auditory stimuli, and that their 
poor performance was the result of a 
specific difficulty in learning to associate
four visual stimuli with four auditory
stimuli.


References 


1. LANE, HELEN S., and SCHNEIDER, JENNYLOUISE
L., A performance test for school-age
deaf children. Amer. Ann. Deaf, 86,
1941, 441-447.

2. McGINNIS, MILDRED A., KLEFFNER, F. R.,
and GOLDSTEIN, R., Teaching aphasic
children. Volta Rev., 58, 1956, 239-244.

3. MONSEES, EDNA K., Aphasia in children,
diagnosis and education. Volta Rev., 59,
1957, 392-401; 414.

4. WALKER, HELEN M., and LEV, J., Statistical
Inference. New York: Holt, 1953.





Visual Spatial Memory in Aphasic Children 


DONALD G. DOEHRING 


The differential diagnosis of congenital 
aphasia as applied to certain language 
disorders in children has been seriously 
questioned. The consensus in a discus- 
sion of the concept of congenital 
aphasia by experts from the disciplines 
of pediatrics, psychology, psychiatry, 
neurology, and speech and hearing
(3) was that what initially appeared to 
be congenital aphasia was usually a 
secondary result of mental retardation, 
hearing loss, dysarthria, behavior dis- 
order, or environmental deficiency. 
Thus, although children who appear 
to be congenitally aphasic may exhibit 
a disturbance of language processes that 
is quite similar to the language disturbance
observed in adults with adventitious
aphasia, these children may differ
greatly from adult aphasics with respect 
to the organic and nonorganic con- 
comitants of their language disorder. 
Despite the uncertain status of the 
term aphasia as a description of one 
specific type of disability in children, 
many children with language disorders 





Donald G. Doehring (Ph.D., Indiana Uni- 
versity, 1954) is Assistant Professor of 
Medical Psychology, Department of Surgery, 
Indiana University School of Medicine. This 
study was conducted while he was Research 
Associate, Central Institute for the Deaf. The 
study was supported by a grant (B-1718)
from the National Institute of Neurological
Diseases and Blindness of the National
Institutes of Health.




are being educated primarily as aphasics 
rather than as deaf, mentally retarded, 
emotionally disturbed, environmental- 
ly deprived, or dysarthric. The pres- 
ent experiment was concerned with 
an investigation of certain perceptual 
abilities in children who were classi- 
fied and trained as aphasics at Central 
Institute for the Deaf (CID). The 
rationale and the procedure used for 
classifying children as aphasic at CID 
have been described in some detail by 
Kleffner (7) and by McGinnis, Kleff- 
ner, and Goldstein (9). Very briefly, 
the classification of aphasia is applied 
to children who manifest a deficiency 
in language development, as assessed in 
an interview situation, that is more 
severe than would be expected on the 
basis of the child’s nonverbal intelli- 
gence, hearing level, motor functioning, 
and affective functioning. This classi- 
fication does not depend upon signifi- 
cant etiological information, positive 
neurologic signs, or abnormal EEG 
records. If a child shows sufficient im- 
provement in language development as 
a result of special instructional methods 
that have been developed for the train- 
ing of aphasic children (7, 9) his classi- 
fication as aphasic is supported. In cases 
where these methods do not result in 
the expected amount of improvement, 
the original classification is considered 
doubtful, and may be changed to another
classification such as deaf or mentally
retarded.


Children who are classified as aphasic 
on the basis of the procedures described 
above may or may not have defects of 
functioning in addition to their de- 
ficiency in language development. A 
neurologic assessment of 69 children 
classified as aphasic and 114 children 
classified as deaf at CID (6) failed to 
reveal any clear-cut neurologic signs 
that could be used to differentiate the 
deaf children from the aphasic chil- 
dren. The writers (p. 475) did com- 
ment, however, that some of the 
aphasic children tended to exhibit an 
‘obtuseness’ that was characterized in 
part by ‘a frustrating incapacity . . .
to signal the visual awareness of a target
light.’ Bilger* found that with
respect to general nonverbal ability, as 
measured by the Advanced Perform- 
ance Scale (1, 8), aphasic children 
trained at CID tended to exhibit a 
lower level of nonverbal ability than 
did deaf children trained at CID, where 
the test was administered before admis- 
sion to training or during the first year 
of training. 

The present experiment was an at- 
tempt to determine whether children 
classified as aphasic are defective in 
certain nonverbal abilities related to 
visual memory. Their performance was 
compared with that of deaf children 
and normal children on a test where 
memory for spatial location was deter- 
mined as a function of (a) duration of 
exposure of the visual stimulus, (b) 
delay of recall, and (c) interference 
with fixation of the visual field. The 


*Personal communication, 1959, from Dr.
Robert C. Bilger, University of Michigan. 


systematic investigation of both non- 
verbal and verbal abilities in children 
classified as aphasic should aid in the 
determination of whether congenital 
aphasia can be considered a primary 
defect of functioning in children. 


Procedure 


Experimental Task. The task that 
the subject was required to perform was 
extremely simple and could be under- 
stood with a minimum of instruction. 
He was asked to indicate the position 
of a spot of light that had been briefly 
flashed on a piece of paper. The 
importance of immediate recall for ac- 
curate localization of the visual stimu- 
lus was determined by requiring the 
subject to wait eight seconds before 
responding on certain trials. The im- 
portance of continuous fixation for 
accurate recall was determined by inter- 
fering with the subject’s visual fixation 
before he was allowed to respond on 
certain trials. The importance of stimulus
duration for the accuracy of
recall was determined by varying the
duration of the spot of light from trial
to trial.


The complex factorial design that 
was used in this experiment may be 
summarized as follows: 


1. Between-subject variables (6 independent groups):

   Ages 7 years, 8 months to 9 years, 7 months
      10 Normal
      10 Deaf
      10 Aphasic

   Ages 9 years, 8 months to 11 years, 7 months
      10 Normal
      10 Deaf
      10 Aphasic

2. Within-subject variables (16 conditions, with
each subject given one trial under each condition):

   No Interference with Fixation

   Exposure    No Delay       8-sec Delay
               of Response    of Response
   0.15 sec    1 trial        1 trial
   0.30 sec    1 trial        1 trial
   0.90 sec    1 trial        1 trial
   3.30 sec    1 trial        1 trial

   Interference with Fixation

   Exposure    No Delay       8-sec Delay
               of Response    of Response
   0.15 sec    1 trial        1 trial
   0.30 sec    1 trial        1 trial
   0.90 sec    1 trial        1 trial
   3.30 sec    1 trial        1 trial


Each of the three groups of subjects 
was subdivided into two age groups. 
Thus, the amount of change in per- 
formance as a function of age could be 
compared with the magnitude of differ- 
ences in performance among the three 
primary groups, and any differential 
changes of performance of these groups
as a function of age also could be de- 
termined. The effects of duration of 
exposure of the visual stimulus, delay 
of the subject’s response, and interfer- 
ence with the subject’s fixation prior to 
his response were investigated as with- 
in-subject variables. The four exposure 
durations, the two delay conditions, 
and the two interference conditions 
were presented in the form of a 
4-by-2-by-2 factorial design. Each sub- 
ject was given one trial with each of 
the 16 experimental conditions: four 
exposure durations for each of two 
conditions of delay and for each of two 
conditions of interference. 
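
As a small illustration, the 16 within-subject conditions of the 4-by-2-by-2 design can be enumerated in Python as the Cartesian product of the three variables. The variable names are hypothetical; the levels are those given in the text.

```python
from itertools import product

EXPOSURES_SEC = (0.15, 0.30, 0.90, 3.30)   # duration of the spot of light
DELAYS_SEC = (0, 8)                        # no delay or 8-sec delay of response
INTERFERENCE = (False, True)               # interference with fixation

# Every combination of exposure, delay, and interference: 4 x 2 x 2 = 16 conditions.
conditions = list(product(EXPOSURES_SEC, DELAYS_SEC, INTERFERENCE))

for exposure, delay, interference in conditions:
    print(f"exposure={exposure:.2f} sec  delay={delay} sec  interference={interference}")

print(len(conditions))  # 16
```

Each subject received one trial under each of these 16 combinations, so 60 subjects contribute 960 observations in all.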


The main purpose of this experiment 
was to determine whether there were 
differential changes in performance
among the three primary groups
(aphasic, deaf, normal) as a function 
of changes in exposure, delay, inter- 
ference, or any interactions among 
these variables. The occurrence of any 
such differential changes for the aphasic 
group as compared with the deaf group 
and the normal group would provide 
evidence that disturbances of language 
processes in children classified as apha- 
sic are accompanied by disturbances 
in certain basic aspects of visual per- 
ception. 


Subjects. A total of 60 children were 
tested, including 20 deaf children and 
20 aphasic children enrolled at CID, 
and 20 normal children from a public 
school in University City, Missouri. 
The aphasic group included nine children
classified as sensory aphasic,*
eight children classified as sensory apha- 
sic with hearing loss, one child clas- 
sified as sensory aphasic with emotional 
disturbance, one child classified as sen- 
sory aphasic and motor aphasic with 
hearing loss, and one child classified 
as sensory aphasic with hearing loss and 
emotional disturbance. The deaf chil- 
dren were used as a control group in 
addition to the normal control group 
because their school environment was 
almost identical with that of the apha- 
sic children, and also because one-half 
of the children in the aphasic group 
were classified as hearing-impaired. It 
was felt that a control for hearing im- 
pairment might be necessary, because 
previous studies have shown that deaf 
children differ from normal children 
on certain tests of visual memory 
(2, 11). 


*See references 7 and 9 for a discussion of
the procedures used at CID for differentiating
sensory aphasia and motor aphasia.

TABLE 1. Mean age (in years and months) and the mean and standard deviation of IQ scores for each
subgroup, where the normal, deaf, and aphasic groups have been subdivided with respect to age. The
IQ of the normal group was based on the California Test of Mental Maturity (12); the IQ of the deaf
group and the aphasic group was based on the Advanced Performance Scale (8).

Age Range     Group     N     Mean Age   IQ Mean   IQ SD
7:8 to 9:7    Normal    10    8:9        121.1     [illegible]
              Deaf      10    8:9        121.8     17.9
              Aphasic   10    8:6        111.3     14.4
9:8 to 11:7   Normal    10    10:6       115.9     11.0
              Deaf      10    10:7       122.8     11.7
              Aphasic   10    10:4       107.6     23.7








Each of the three primary groups 
contained nine girls and 11 boys. Table 
1 shows the mean age and the mean 
and standard deviation of IQ scores for 
each subgroup, where each of the three 
primary groups has been divided into 
two age groups. It should be noted that 
the IQ scores of the deaf children and 
aphasic children are based upon the 
Advanced Performance Scale, and 
therefore cannot be directly compared 
with the IQ scores of the normal chil- 
dren, which are based upon the Cali- 
fornia Test of Mental Maturity. 

Apparatus. The visual stimulus was 
a spot of light 5 mm in diameter pro- 
jected on the 5%” x 5%” viewing sur- 
face of a Micronta-Vue slide viewer. 
The spot of light was produced by
shining the 100-watt projection
lamp through a small hole in a piece of
paper that had been inserted in a slide 
holder for 35-mm film. The small spot 
of light that passed through this slide 
was magnified and then reflected from 
a mirror to the viewing surface. A card- 
board frame was attached to the edges 
of the ground glass viewing surface, 
and when a piece of white paper was 
inserted into the frame, the visual 
stimulus appeared as an orange spot on
the 5.25” x 5.25” portion of the paper
that was exposed in the frame. A differ- 
ent slide was used on each trial, with 
the location of the hole in the slide 
varied in such a way that the spot of 
light would appear somewhere within 
a 4” x 4” area in the central portion of 
the paper that covered the viewing sur- 
face. On the eight slides that were used 
for the practice trials, the location of 
the hole was varied systematically to 
cover all parts of this area. The slides 
for the test trials were made in such a 
way that on each of the 16 test trials, 
the spot of light appeared in a different 
one of the 16 one-inch squares that 
made up the 4” x 4” area. 

Subjects were tested in their respec- 
tive schools in rooms comparatively 
free from distractions. The ambient 
light level was approximately the same 
for the two rooms used for testing. 
The slide viewer was located on a table 
in front of the subject with the viewing 
surface at an angle of 45° from the hor- 
izontal, and the subject was positioned 
in such a way that his face was ap- 
proximately one foot from the viewing 
surface. Subjects who had glasses were 
required to wear them, but no attempt 
was made to screen subjects for visual
acuity, since the spot of light was
large and bright enough that it could 
be easily seen even by children with 
defective vision. Duration of exposure 
of the visual stimulus was controlled by 
an interval timer, which turned on the 
projection lamp for an amount of time 
that was preset on the timer. Durations 
of 0.15, 0.30, 0.90, and 3.30 seconds 
were obtained in this way. 

Method. There were eight practice 
trials and 16 test trials. Instructions to 
the subject were very simple. On each 
trial a new piece of 5.5” x 8.5” paper 
was placed in the cardboard holder. 
The subject’s attention was directed to 
the paper just before the spot of light 
was turned on for each trial. On the 
first practice trial the experimenter 
turned on the spot of light and kept it 
on while he used a pencil to place an 
X over the spot of light, making sure 
that the crossbars of the X intersected 
exactly in the center of the spot of 
light. On the second and the third 
practice trials the experimenter turned 
on the spot of light and kept it on 
while he placed the pencil in the sub- 
ject’s preferred hand and directed the 
subject to make an X over the spot of 
light. All subjects were able to do this
with no difficulty. 

The fourth practice trial demon- 
strated the experimental conditions in 
which there was no delay and no interference.
The spot of light was turned
on for 3.30 seconds, and immediately 
after it had been turned off the experi- 
menter placed the pencil in the subject’s 
hand and indicated that he should make 
an X where the light had been. After 
the subject had done this, the experi- 
menter turned on the light again and 
kept it on while he drew a circle that 
enclosed the spot of light. This correction
procedure was used on all subsequent
trials. It served to maintain the
motivation of the subject, since it gave 
him information about the accuracy of 
his performance, and it also provided a 
record of the subject’s error. The fifth 
practice trial was identical with the 
fourth, except that the light was turned 
on for only 0.90 seconds. 

The sixth practice trial demonstrated 
the experimental conditions that in- 
volved a delay of response with no in- 
terference. The spot of light was 
turned on for 0.30 seconds and the ex- 
perimenter did not place the pencil in 
the subject’s hand until eight seconds 
after the termination of the visual stimulus.
All subjects attempted to maintain
their fixation on the place where the 
spot of light had been during the period 
of delay, and therefore it was not neces- 
sary for the experimenter to provide 
special instructions to this effect. 

The seventh practice trial demon- 
strated the experimental conditions that 
involved interference of fixation with
no delay of response. The spot of light
was turned on for 0.15 seconds. Fol- 
lowing this, the experimenter placed 
an 8” x 12” piece of black paper in 
front of the subject’s eyes at the same 
time that he placed the pencil in the 
subject’s hand, and then the paper was 
immediately removed. The paper was 
covered with a large number of irregu- 
larly-spaced orange spots that were the 
same size as the visual stimulus. The 
interposition of this paper between the 
subject’s eyes and the viewing surface 
was intended to disrupt the subject’s 
fixation upon that portion of the view- 
ing surface where the dot had been 
projected. 

The eighth and last practice trial
demonstrated the experimental conditions
that involved both delay and interference.
The spot of light was turned
on for 0.30 seconds and then the black
paper was held in front of the subject’s
eyes for eight seconds before the pencil
was placed in his hand. Thus, all four
of the exposure durations and all four
of the experimental conditions that involved
delay and interference were
demonstrated to the subject during the
practice trials.

TABLE 2. Summary of analysis of variance of the effects of age, delay, interference, and exposure time
on visual spatial memory in groups of normal, deaf, and aphasic children.

Source                     df          ms         F
Between Subjects           59
  Groups (G)                2     1346.00      5.82*
  Age (A)                   1      808.00      3.49†
  G x A                     2       15.00
  Error (b)                54      231.31
Within Subjects           900
  Delay (D)                 1    16550.00    247.27*
  D x G                     2       29.00
  D x A                     1       80.00      1.20
  D x G x A                 2        9.00
  Error (w)                54       66.93
  Interference (I)          1    12877.00    138.46*
  I x G                     2      179.00      2.44
  I x A                     1      202.00      2.76
  I x G x A                 2      207.50      2.83
  Error (w)                54       73.28
  Exposure (E)              3       82.00      1.16
  E x G                     6       33.67
  E x A                     3       31.00
  E x G x A                 6       63.67
  Error (w)               162       70.76
  D x I                     1      102.00      1.44
  D x I x G                 2      258.50      3.65†
  D x I x A                 1       34.00
  D x I x G x A             2      196.00      2.77
  Error (w)                54       70.70
  D x E                     3       34.67
  D x E x G                 6       72.83      1.10
  D x E x A                 3       21.33
  D x E x G x A             6       91.83      1.39
  Error (w)               162       66.06
  I x E                     3      114.67      1.79
  I x E x G                 6       63.50
  I x E x A                 3        2.33
  I x E x G x A             6       26.00
  Error (w)               162       64.22
  D x I x E                 3       56.00
  D x I x E x G             6       81.83      1.21
  D x I x E x A             3       32.00
  D x I x E x G x A         6       54.17
  Error (w)               162       67.79
Total                     959

* Significant at the 1% level.
† Significant at the 5% level.

The 16 test trials were presented
immediately following the eight practice
trials. The 16 slides that were used for
the test trials were presented in the
same random order to all subjects. This
meant that the location of the spot of
light was changed from trial to trial in
the same sequence for all subjects. The
16 experimental conditions were presented
in five different random orders.
The level of accuracy of the subjects’
performance did not appear to change
throughout the experimental session.
The intertrial interval on test trials was
about 15 seconds and the total experimental
session lasted about 20 minutes.

FIGURE 1. Changes in the accuracy of visual spatial memory of the normal, deaf, and aphasic
groups as a function of delayed response (upper left), interference with fixation (upper right),
duration of exposure of the visual stimulus (lower left), and chronological age (lower right).
[Figure not reproduced. Panels: Delay (no delay, 8-sec delay); Interference (no interference,
interference); Exposure (0.15, 0.30, 0.90, 3.30 sec); Age (7:8-9:7, 9:8-11:7). Separate symbols
mark the normal, deaf, and aphasic groups.]


Results 


An error was measured as the dis- 
tance in millimeters from the intersec- 
tion of the crossbars in the X made by 
the subject to the center of the circle 
that was drawn by the experimenter 
to cover the spot of light. The resulting 
distribution of errors was positively 
skewed, and therefore a square root 
transformation was employed to nor- 
malize the distribution of errors (5). 
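
The error measure and its transformation can be sketched as follows in Python. This is a minimal illustration only: the function names and the coordinates are hypothetical, and the sketch assumes that the positions of the subject's X and of the experimenter's circle have already been read off the paper in millimeters.

```python
import math


def localization_error_mm(mark_xy, target_xy):
    """Straight-line distance (mm) from the X intersection to the center of the circle."""
    return math.hypot(mark_xy[0] - target_xy[0], mark_xy[1] - target_xy[1])


def sqrt_transform(errors_mm):
    """Square root transformation applied to positively skewed error scores."""
    return [math.sqrt(e) for e in errors_mm]


# Hypothetical trial: X drawn at (13, 24) mm, spot of light centered at (10, 20) mm.
print(localization_error_mm((13, 24), (10, 20)))  # 5.0

# Hypothetical raw errors with one large value in the upper tail.
print(sqrt_transform([1, 4, 9, 49]))  # [1.0, 2.0, 3.0, 7.0]
```

In the second example the largest raw error is 49 times the smallest, but after transformation it is only 7 times the smallest, which shows how the square root pulls in the long upper tail of a positively skewed distribution.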


The results of the analysis of variance 
of the experimental variables are sum- 
marized in Table 2. Significant effects 
were found for both of the main effects 
in the ‘between-subjects’ portion of the 
analysis, with the group effect signifi- 
cant at the 1% level and the age effect 
significant at the 5% level. The signifi- 
cant group effect resulted from the fact 
that the overall errors of the normal 
group were smaller than those of the 
deaf group, and the overall errors of 
the deaf group were smaller than those 
of the aphasic group. This trend can be 
seen in all four of the graphs that are 
shown in Figure 1. Further analysis by 
the Newman-Keuls Sequential Range 
Test (4) revealed that the differences 
in overall errors between the normal 
group and the aphasic group and be- 
tween the deaf group and the aphasic 
group were significant at the 5% level, 
but the difference between the deaf 


group and the normal group was not 
significant. The significant age effect 
resulted from the larger errors made by 
the young children as compared with 
the older children. This trend can be 
seen in the graph at the lower right of 
Figure 1. The same graph shows that 
the decrease in errors with age was 
about the same for all groups, resulting 
in a very small group-by-age interac- 
tion. 

The within-subjects portion of the 
analysis of variance summarized in 
Table 2 revealed that an extremely 
large proportion of the within-subject 
variance was attributable to the effects 
of delay and interference. Both of these 
effects were significant well beyond 
the 1% level. The F ratio for exposure, 
on the other hand, did not approach the 
5% level. Nor were there any signifi- 
cant interactions of any of these effects 
with groups. The effects of delay, in- 
terference, and exposure and the inter- 
actions of these effects with groups 
can be seen in Figure 1. The two 
graphs at the top of Figure 1 show that 
the relative magnitude of the delay ef- 
fect was almost identical with that of 
the interference effect, and that each of 
these variables produced an almost 
identical effect upon the three groups. 
The graph in the lower left of Figure 
1 shows that changes in the duration of 
exposure produced very little change in 
the performance of any of the groups. 

There was only one significant effect 
among the 16 higher-order interaction 
terms. The delay-by-interference-by- 
groups interaction was significant at the 
5% level. This significant interaction 
was the result of the relatively large 
mean errors made by the normal group 
for the experimental conditions that 
combined delay and interference. The
delay-plus-interference condition ap-
parently disturbed the visual memory
of the normal group somewhat more
than it disturbed the visual memory of
the deaf group or the aphasic group.

Table 3. Summary of analysis of variance of the effects of position of the stimulus on visual spatial
memory in normal, deaf, and aphasic children. In view of the homogeneity of the within-subject error
terms in the overall analysis, all error variances were pooled into a single estimate for this analysis.
(The between-subjects portion of this analysis is not shown since it is identical with the between-subjects
portion of the analysis summarized in Table 2.)

Source                      df        ms        F
Within Subjects            900
  Rows                       3    832.33     8.44*
  Columns                    3     69.33
  [label illegible]          6     39.33
  [label illegible]          3     34.67
  Rows x Columns             9    363.89     3.69*
  [label illegible]          6     14.83
  [label illegible]          3     55.33
  [label illegible]          6     37.77
  [label illegible]         18    103.39     1.05
  [label illegible]          9     87.67
  [label illegible]          6     95.67
  [label illegible]         18     38.89
  Error (w)                810     98.60
Total                      959

* Significant at the 1% level.

Effect of Spatial Location. As de-
scribed above, the location of the visual
stimulus was systematically varied from
trial to trial. Subjects with visual-field
defects might have difficulty in locat-
ing and in remembering the position of
the spot of light in certain regions of
the stimulus area, and since a higher-
than-normal incidence of such defects
might be expected in the deaf group or
the aphasic group, magnitude of error
was analyzed as a function of stimulus
location in order to determine whether
stimulus location had contributed to
the errors made by the deaf children or
the aphasic children. The spot of light
had been presented once in each square
inch of the 4” x 4” stimulus field, and
therefore the 16 stimulus locations
could be divided into four rows and
four columns. This made possible the
analysis of variance for stimulus loca-
tion shown in Table 3. None of the
interactions that involved groups were
significant for this analysis, and it can
be concluded that the performance of
the deaf group and the aphasic group as
compared with the normal group was
not unduly influenced by the location
of the spot of light in the stimulus field.
The significant F ratios for rows and
for rows by columns resulted from the
fact that the more difficult experimental
conditions involving delay and interfer-
ence were not equally distributed
among the various stimulus locations.
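
The F ratios reported in Table 3 can be checked against the pooled error term. The sketch below assumes the conventional definition of F as an effect mean square divided by the error mean square; the dictionary keys are descriptive labels chosen for illustration, and only the mean squares are taken from the table.

```python
# Pooled within-subject error mean square from Table 3 (810 df).
MS_ERROR_W = 98.60

# Mean squares for the three effects whose F ratios are printed in Table 3.
effect_mean_squares = {
    "rows (3 df)": 832.33,
    "rows by columns (9 df)": 363.89,
    "18-df interaction": 103.39,
}

# F = effect mean square / error mean square, rounded as in the table.
f_ratios = {
    name: round(ms / MS_ERROR_W, 2) for name, ms in effect_mean_squares.items()
}

for name, f in f_ratios.items():
    print(f"{name}: F = {f}")
```

The recomputed values agree with the printed ratios of 8.44, 3.69, and 1.05.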


Discussion 


The experimental task was carefully 
designed in such a way that the aphasic 
children and the deaf children would
not be penalized by a failure to under- 
stand instructions or by being less 
familiar with the visual stimulus than 
were the normal children. The per- 
formance of the deaf children was very 
similar to that of the normal children, 
indicating that deafness does not se- 
riously impair the visual perceptual 
abilities that were involved in the ex- 
perimental task. Although the present 
experiment was primarily concerned 
with the performance of aphasic chil- 
dren, this result was interesting in itself, 
in view of statements to the effect that 
deafness in children results in basic 
changes in all sensory systems and a 
change in general perceptual ability 
(10). 

It is somewhat more difficult to inter- 
pret the performance of the aphasic 
children. Their memory for the posi- 
tion of the visual stimulus tended to be 
less accurate than that of the deaf chil- 
dren and the normal children under all 
experimental conditions, but their per- 
formance was no more severely dis- 
rupted than that of the deaf children 
and the normal children by delayed 
response and by interference with fixa- 
tion. No conclusions can be drawn re- 
garding the differential influence of 
exposure duration, since there was no 
significant change in the performance 
of any of the groups as a result of de- 
crease in the duration of exposure of 
the visual stimulus. The use of dura- 
tions short enough to produce some 
disruption of performance would be 
necessary in order for the differential 
effects of this variable to be properly 
assessed. 

The generally poorer accuracy in 
visual memory of the aphasic children 
may be related to the fact that the mean 
nonverbal IQ of the aphasic group was
about 10 points lower than that of the 
deaf group. For all three groups the 
correlation between IQ and total
amount of error on the experimental
task, although not significant (-.27 for
the normal group, -.25 for the deaf
group, and -.33 for the aphasic group),
suggested that children with lower IQs 
tended to make larger errors. It thus 
appears possible that the aphasic group 
might have performed as accurately as 
the deaf group if the two groups had 
been matched on the basis of nonverbal 
IQ. As mentioned previously, how-
ever, Bilger (personal communication)
found that aphasic children tended to 
score lower than deaf children on the 
Advanced Performance Scale. It seems 
reasonable to assume, therefore, that 
with respect to their relative IQ level, 
the deaf children and aphasic children 
tested in this experiment were repre- 
sentative of the general population of 
deaf and aphasic children trained at 
CID. 
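
Why correlations of this size fall short of significance can be seen from a minimal sketch of one conventional test, the t test for a correlation coefficient. This is not necessarily the test the author applied. It assumes 20 children per group (the group size given in the Summary) and uses the standard tabled two-tailed 5% critical value of t for 18 degrees of freedom.

```python
import math

N_PER_GROUP = 20           # assumed: all 20 children in each group contributed
T_CRIT_5PCT_DF18 = 2.101   # two-tailed 5% critical t for n - 2 = 18 df

def t_for_r(r, n):
    """t statistic for testing a correlation r against zero with n pairs."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

# IQ-error correlations reported in the text.
correlations = {"normal": -0.27, "deaf": -0.25, "aphasic": -0.33}

t_values = {g: t_for_r(r, N_PER_GROUP) for g, r in correlations.items()}
significant = {g: abs(t) > T_CRIT_5PCT_DF18 for g, t in t_values.items()}

for group, t in t_values.items():
    print(f"{group}: t = {t:.2f}, significant at 5%: {significant[group]}")
```

Under these assumptions none of the three coefficients reaches the 5% level, in agreement with the text.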

Some idea of the relative magnitude 
of the difference in overall accuracy 
exhibited by the aphasic group can be 
gained by reference to the difference 
between age groups. Figure 1 shows 
that the subgroup of aphasic children 
in the age range of 9 years, 8 months 
to 11 years, 7 months performed at 
about the same level of overall accuracy 
as the subgroups of normal children and 
deaf children in the age range of 7 
years, 8 months to 9 years, 7 months. 
At these age levels, then, the aphasic 
children were retarded by about two 
years with respect to the overall ac- 
curacy of their performance of the 
experimental task. 

When their spatial-location response 
was delayed for eight seconds, the per-
formance of the aphasic children was
no more severely disrupted than that of 
the other children. Thus, the aphasic 
children were apparently not retarded 
with respect to operation of the sym- 
bolic processes that are involved in this 
aspect of visual memory, even though 
their overall performance was less ac- 
curate than that of the other children. 
Further evidence for the normal opera- 
tion of symbolic processes is provided 
by the fact that interference with fixa- 
tion of the visual stimulus did not pro- 
duce an excessive disruption of the 
performance of the aphasic children. 
This experimental condition was used 
in order to insure symbolic mediation 
of the visual memory process during 
delayed response. 

The results of this experiment sug- 
gest that, with reference to the abilities 
of aphasic children, visual memory is 
a complex process. The group of apha- 
sic children who were tested in this 
experiment appeared to be normal with 
respect to the visual symbolic processes 
associated with delayed response, but 
somewhat retarded in general accuracy 
of visual memory for spatial location. 
These findings indicate that children 
classified as aphasic by the procedures 
used at CID are retarded in some, but 
not all, aspects of visual perceptual 
ability. It is not clear whether this re- 
tardation in general accuracy should be 
attributed to the aphasic condition it-
self, or whether it should rather be at-
tributed to the lower nonverbal IQ that
appears to characterize aphasic children 
at CID. Further comparisons of the 
abilities of these children with the 
abilities of nonaphasic children would 
be necessary in order to specify the 
exact nonverbal deficiencies that ac- 
company verbal deficiencies in these 
children. 


Summary 


Accuracy of memory for the loca- 
tion of a visual stimulus as a function 
of delayed recall, interference with fixa- 
tion, and duration of exposure was 
investigated with a group of 20 children 
classified as aphasic, 20 normal children, 
and 20 deaf children. Results showed 
that the accuracy of performance of all 
three groups was unaffected by varia- 
tions in duration of exposure, was de- 
creased by about the same amount for 
each group by delay of recall and in- 
terference with fixation, and changed 
as a function of chronological age in 
about the same way for all three 
groups. The aphasic group was, how- 
ever, significantly less accurate than the 
deaf group and the normal group in 
terms of total amount of error. These 
results suggest that children classified 
as aphasic are retarded in some, but 
not all, aspects of visual perceptual 
ability. 


Acknowledgment 


The writer wishes to thank Dr. 
Joseph Rosenstein for his assistance in 
this experiment and Mr. Earl H. Gree- 
son, Principal, Delmar-Harvard School, 
University City, Missouri, for his co- 
operation in making available the group 
of normal children. 


References 


1. Bilger, R. C., Limitations on the use of
intelligence scales to estimate the mental
ages of children. Volta Rev., 60, 1958,
321-325.

2. Blair, F. X., A study of the visual mem-
ory of deaf and hearing children. Amer.
Ann. Deaf, 102, 1957, 254-263.

3. Brown, S. (Ed.), The Concept of Con-
genital Aphasia from the Standpoint of
Dynamic Differential Diagnosis. Wash-
ington: Amer. Speech Hearing Ass.,
1959.

4. Duncan, D. B., Multiple range and mul-
tiple F tests. Biometrics, 11, 1955, 1-42.

5. Edwards, A. L., Experimental Design in
Psychological Research. New York:
Rinehart, 1950.

6. Goldstein, R., Landau, W. M., and
Kleffner, F. R., Neurologic assessment
of some deaf and aphasic children. Ann.
Oto. Rhino. Laryng., 67, 1958, 468-479.

7. Kleffner, F. R., Teaching aphasic chil-
dren. Education, 79, 1959, 413-418.

8. Lane, Helen S., and Schneider, Jenny-
Louise L., A performance test for school-
age deaf children. Amer. Ann. Deaf, 86,
1941, 441-447.

9. McGinnis, Mildred A., Kleffner, F. R.,
and Goldstein, R., Teaching aphasic
children. Volta Rev., 58, 1956, 239-244.

10. Myklebust, H. R., Towards a new un-
derstanding of the deaf child. Amer.
Ann. Deaf, 98, 1953, 345-357.

11. Pintner, R., and Paterson, D., A com-
parison of deaf and hearing children
in visual memory for digits. J. exp.
Psychol., 2, 1917, 76-88.

12. Sullivan, Elizabeth T., Clarke, W. W.,
and Tiegs, E. W., California Test of
Mental Maturity. Los Angeles: Calif. Test
Bur., 1951.





Vocal Pitch Variation 
Related to Changes in Vocal Fold Length


HARRY HOLLIEN 


Despite the opinion of Negus (7) 
who argued on anatomical grounds 
that the length of the vocal folds can- 
not change appreciably, the literature 
on vocal physiology reveals a consen- 
sus that vocal fold length and vocal 
pitch are related. Moore (5) observed 
a general lengthening of the vocal folds 
associated with rise in vocal pitch. In 
his article on the high speed motion 
pictures of the vocal folds produced 
at the Bell Telephone Laboratories, 
Farnsworth (2) reports a similar ob- 
servation. Examination of the recent 
Moore and von Leden films (6, 8) 
indicates the same general tendency. 
In none of the above studies, however, 
was an attempt made to obtain meas- 
urements relating vocal fold length 
changes to pitch variations. In fact the 
available measurements are quite mea- 
ger and are by no means in good agree- 
ment concerning the nature of this 
relationship. Irwin (4) measured laryn- 
geal photographs of one subject and 
found only a limited tendency for 
length to increase with pitch rise. The 
elongation he observed was relatively
small and confined to a narrow range
of low pitches. Above this narrow
range he observed little or no change
in length with variation in pitch.
Brackett (1) measured vocal fold
length from laryngeal photographs of
two subjects under two conditions of
pitch variation. When his subjects
changed vocal pitch in a continuous
glide an octave in extent, little or
no length change was observed. On the
other hand, when the subjects sang
two discrete sustained tones separated
by an octave the folds were found to
be definitely longer for the higher
pitch.

Harry Hollien (Ph.D., University of Iowa,
1955) is Assistant Professor of Logopedics,
University of Wichita and Institute of Logo-
pedics. This article is based on a portion of
a doctoral dissertation completed at the Uni-
versity of Iowa under the direction of Dr.
James F. Curtis.

The lack of data concerning this 
pitch and length-of-fold relationship 
may be attributed, in part at least, to 
two problems inherent in such in- 
vestigation. One problem is the diffi- 
culty in obtaining subjects who are 
able or willing to undergo the some- 
what unpleasant experience of learn- 
ing to suppress the gag reflex and 
expose the vocal folds to laryngo- 
scopic viewing. A second problem is 
that, in photographing the folds, the
lens-to-field distance is variable and
usually unknown. This distance not
only varies from subject to subject 
due to differences in anatomical size, 
but also varies from pitch to pitch for
a single subject, since the elevation of 
the larynx may change with change 
in vocal pitch. Since the size of the 
photographed image is a function of 
this distance, it follows that measure- 
ments from laryngeal films are subject 
to errors resulting from these distance 
variations, unless the lens-to-folds dis- 
tance is known for each condition and 
suitable correction made. 

The purpose of this study was to 
obtain additional information concern- 
ing the relationship between vocal 
pitch change and vocal fold elongation. 
The study attempted measurement of 
a larger number of subjects than had 
been measured by previous investigators 
and employed a correction technique 
to minimize the errors resulting from 
variation in the lens-to-vocal-fold dis- 
tance. 


Procedure 


Subjects. The study here reported
was a portion of a larger project utiliz-
ing lateral x-ray and laminographic pro-
cedures for studying anatomical and
physiological factors associated with
vocal pitch. The same four groups of
subjects were used in all phases of this
project. The procedures by means of
which they were selected have been
described in a previous article (3). In
general the four groups were comprised
of (a) Group LM, six very low pitched
male voices; (b) Group HM, six very
high pitched male voices; (c) Group
LF, six very low pitched female voices;
and (d) Group HF, six very high
pitched female voices.

Figure 1. Arrangement of equipment for
laryngoscopic photography. C is a 16-mm
motion picture camera; L, the light source;
M-1, the reflecting mirror; M-2, the sub-
ject’s observation mirror; and M-3, a laryn-
geal mirror.


Equipment. Equipment used in this 
study (see Figure 1 for schematic draw- 
ing) consisted of a standard #5 laryn- 
geal mirror; an auxiliary mirror in 
which the subject could observe the 
laryngeal mirror and thus know when 
his vocal folds were exposed; an ordi- 
nary parabolic head mirror to direct 
the light beam to the laryngeal mirror; 
a 500-watt light source with a condens- 
ing lens and prism; and an Eastman 
Cine-Kodak Special 16-mm motion pic- 
ture camera with a four-inch telephoto 
lens. A motion picture camera, rather 
than a single exposure camera, was used 
since it was thus possible to select the 
optimum frame for measurement from 
a sequence consisting of a large number 
of frames. Hence, the likelihood of ob- 
taining at least one photograph in 
which the vocal folds were sufficiently 
exposed to permit measurement was 
substantially increased. 


Selection and Control of Vocal 
Pitches. Each subject was required to 
phonate four pitch levels, three chosen 
to represent a distribution of levels 
within the normal pitch register and 
one from the falsetto register. The
pitches were specified in relation to 
the subject’s total pitch range and were 
located as proportions of the range 
above his lowest sustainable tone. The 
10, 25, 50, and 85% points to the near- 
est semitone were chosen for this pur- 
pose. In addition, an attempt was made 
to photograph each subject with his 
vocal folds in the abducted position. 
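
The selection of the four pitch levels can be sketched as follows. The sketch assumes that the pitch range is measured in semitones on the equal-tempered scale, which the text implies but does not state outright; the function name and the three-octave example range are hypothetical.

```python
import math

# Proportions of the total pitch range above the lowest sustainable tone.
PROPORTIONS = (0.10, 0.25, 0.50, 0.85)

def target_pitches(lowest_hz, highest_hz):
    """Return (proportion, semitones above lowest tone, frequency in Hz)."""
    # Total range in equal-tempered semitones (assumed unit of measurement).
    range_semitones = 12 * math.log2(highest_hz / lowest_hz)
    targets = []
    for p in PROPORTIONS:
        steps = round(p * range_semitones)      # nearest semitone
        freq = lowest_hz * 2 ** (steps / 12)    # frequency of that semitone
        targets.append((p, steps, freq))
    return targets

# Hypothetical subject with a three-octave range (36 semitones).
example = target_pitches(80.0, 640.0)
for p, steps, freq in example:
    print(f"{int(p * 100)}% point: {steps} semitones above lowest, {freq:.1f} Hz")
```

For the hypothetical subject the four levels fall 4, 9, 18, and 31 semitones above the lowest tone.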

Control of the fundamental fre- 
quency of phonation was obtained by 
use of a reference tone at the proper 
frequency level provided by an ordi- 
nary chromatic pitch pipe with a range 
of Fs; to Fy. Because of the limited 
range of this device, training periods 
were held until each subject was able 
to produce the selected pitches in the 
proper octave. Subjects were requested 
to produce all vocal pitches at a ‘com- 
fortable’ loudness level. 


Anesthesia. As previously indicated, 
one of the major problems encountered 
in laryngeal photography is the gag 
reflex. In this study the criteria for 
selecting subjects did not include abil- 
ity to suppress this reflex. Accordingly, 
a light topical anesthesia was applied 
as needed to the soft palate and poste-
rior pharyngeal wall in order to reduce
gagging and allow comfort in using the 
laryngeal mirror. All anesthesia was 
administered under the supervision of 
a resident physician in the Department 
of Otolaryngology of the State Uni- 
versity of Iowa Hospitals. It is possible 
to object to the use of anesthesia on 
the grounds that natural function may 
be changed in some manner. However, 
this seems unlikely since a spray of the 
kind used has only a superficial effect, 
desensitizing the sensory nerve endings 
with little or no effect on motor func-
tion. Moreover, the area to which the
spray was applied was quite remote 
from the larynx. 


Photographic Procedure. Each sub- 
ject was seated before the laryngeal 
mirror on an adjustable stool and al- 
lowed to introduce this mirror into 
his pharynx to familiarize himself with 
the position and appearance of the vo- 
cal folds. During this period the ex- 
perimenter observed progress through 
the prismatic viewing system of the 
camera. By this means he was able 
to determine which of the conditions 
(that is, abduction or one of the con- 
ditions of phonation) allowed the 
subject to expose his vocal folds with 
the greatest ease. This condition was 
chosen as the initial one for photo- 
graphing. 

During a photographic run the ex- 
perimenter observed the image of the 
subject’s vocal folds through the cam- 
era’s viewing system and cued the 
subject with the frequency to be pho- 
nated. Once the subject phonated the 
desired pitch and, through manipula- 
tion of his head in relation to the lar- 
yngeal mirror, exposed the greatest 
possible extent of the vocal folds, the 
camera was triggered and a length of 
film was exposed. 


Preliminary experimentation showed 
that only a few subjects could expose 
their vocal folds unless their tongues 
were pulled forward and down. 


Hence, in order to make conditions 
comparable from subject to subject, 
each was asked to hold the tip of the 
tongue gently with a surgical gauze 
pad. Possible distortion thus intro- 
duced was recognized, but there was 
little choice in this respect since, with
the tongue left free, very few pictures 
would have been obtained. 


Measurements. The frames to be 
measured were selected after repeated 
viewing of the films shown by use of 
a Keystone K-160 16-mm motion pic- 
ture projector. Measurements were 
made by projecting the selected frames 
with a Griscombe Model TA micro- 
film reader. This projector reflected 
the desired image onto a table top so 
that tracings of the vocal folds could 
be made. Distance from the film to 
the table top was always constant so 
no correction for variation of this dis- 
tance was required. 

Two rough checks of reliability 
were made on the tracings of the vocal 
folds. First, each of a number of 
frames was traced two or more times 
to see if the structures involved could
be delineated with confidence. Differ-
ences from tracing to tracing were so
small that ordinarily they could be ac-
counted for by the width of the
marking. In addition, for several sub-
jects and conditions, several different
frames of the same sustained tone were
measured. Differences here were also
small and indicated that the frame to be
measured could be selected relative to
other criteria.

Figure 2. Schematic drawing of the larynx
as seen in laryngoscopic photography. E is
the epiglottis; F, the vocal folds; and T,
the tubercles formed by the corniculate and
arytenoid cartilages. Line A is drawn tangent
to the most anterior extent of the vocal
folds; and line B, tangent to the tubercles,
is considered the posterior boundary. All
measurements were made from A to B.


After all tracings had been com- 
pleted, the final selection was made 
of the frames to be measured. It was 
desirable that the tracing include the 
entire anteroposterior length of the 
folds and the bulk of frames measured 
met this criterion. In a few instances 
no frame was available in which the 
extreme anterior end of the folds was 
not obscured by the epiglottis. The 
tracings of such frames were extrapo- 
lated only if it was obvious that very 
little of the anterior portion was hid- 
den and if other frames allowed for 
confident delineation of the vocal fold 
outline in this area. 


Figure 2 is a schematic drawing of 
the view seen in laryngoscopic photog- 
raphy and illustrates the problems of 
determining boundaries for length 
measurements. In addition to the diffi- 
culty of locating the exact anterior 
termination of the vocal folds, there 
is also a problem in determining their 
posterior limits. This is true because 
the posterior portion of the folds lies 
between the arytenoids and change in 
the amount of adduction of these struc- 
tures can produce apparent change in 
this posterior border. Accordingly, 
measurements were made from a later-
al line which was drawn tangent to the
anterior borders of the tubercles formed
by the protrusions of the cornicu-
late and arytenoid cartilages. Figure
2 illustrates this procedure with line
B defining the posterior limits of the
vocal folds for this study. Although
the boundary thus obtained is some-
what arbitrary and the resulting meas-
urements do not represent the total
length of the vocal folds, the consis-
tency with which B could be located
dictated its use. Line A defines the
anterior limits of the vocal folds. All
measurements represent the distance
from A to B.

Table 1. Measurements of the apparent length of the vocal folds during phonation and abduction.
All measurements were corrected for variation in camera-to-vocal-fold distance and are reported in
millimeters. Where there are no cell entries the photographs were inadequate to allow for valid meas-
urements.

Group     Abducted    Relative Fundamental Frequency of Voice
          Position    Low      Medium   High     Falsetto
                      (10%)    (25%)    (50%)    (85%)

[Rows with fewer than five entries list their values in source order; the original column
positions of the empty cells are not recoverable from this copy.]

Low Male (LM)
LM-1      11.2  12.5  13.2  13.9
LM-2
LM-3      20.3  11.0  16.9  19.1
LM-4      21.8  11.2  13.5  16.3  17.3
LM-6      16.5  8.4  10.5  14.3

High Male (HM)
HM-1      13.3  10.0  9.9  14.1  14.4
HM-2      14.8  10.9  10.5  17  12.1
HM-4      8.6
HM-5      15.7  12.3
HM-6

Low Female (LF)
LF-1      10.5  7.9  8.1  10.0
LF-2      13.1
LF-3      11.8  7.1  7.8  17  11.5
LF-4      12.0  7.2  9.9  11.3  10.7
LF-6      11.0  8.2  6.9  9.6  10.8

High Female (HF)
HF-2      9.8  6.9  8.7  7.8
HF-3      9.7  7.1  7.2  8.8  9.5
HF-4
HF-5      10.1  6.3  6.5  8.1
HF-6      10.4  6.2  6.9  8.4  7.6



Corrections. As indicated, a source 
of measurement error is the variation 
in the distance from the lens to the 
vocal folds. This variation results from
anatomical size differences among sub- 
jects and from the relative elevation 
of the folds for different vocal pitches 
within subjects. A correction for this 
source of error was accomplished in- 
sofar as possible in the following 
manner. First, the approximate distance 
from the vocal folds to the laryngeal 
mirror was obtained from laminagraph- 
ic x rays which were available for 
each of the subjects and each of the 
experimental conditions. Second, a 
millimeter grid was photographed be- 
low the laryngeal mirror at spaced 
distances throughout the range of these 
estimates. From the projected images 
of these grids, a table of correction 
factors was developed which included 
corrections for lens-to-field variation 
and a constant to return the enlarged 
image of the vocal folds to life size. 
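
The correction procedure described above can be sketched as a lookup in a calibration table followed by a rescaling of the traced length. The calibration values, the use of linear interpolation between grid distances, and the function names are all hypothetical; the article does not report the actual correction factors.

```python
# Hypothetical calibration from the photographed millimeter grid:
# (mirror-to-fold distance in mm, projected size in mm of one true millimeter).
CALIBRATION = [(60.0, 6.0), (70.0, 5.5), (80.0, 5.0), (90.0, 4.5)]

def projected_mm_per_true_mm(distance_mm):
    """Magnification at a given distance, by linear interpolation (assumed)."""
    points = sorted(CALIBRATION)
    if not points[0][0] <= distance_mm <= points[-1][0]:
        raise ValueError("distance outside the calibrated range")
    for (d0, m0), (d1, m1) in zip(points, points[1:]):
        if d0 <= distance_mm <= d1:
            fraction = (distance_mm - d0) / (d1 - d0)
            return m0 + fraction * (m1 - m0)

def corrected_length_mm(traced_mm, distance_mm):
    """Convert a traced (enlarged) length to life size for one condition."""
    return traced_mm / projected_mm_per_true_mm(distance_mm)

# A 52.5-mm tracing photographed with the folds an estimated 75 mm from the mirror.
length = corrected_length_mm(52.5, 75.0)
print(f"corrected length: {length:.1f} mm")
```

A single factor per distance handles both the lens-to-field variation and the return to life size, since the grid images pass through the same projection as the vocal fold images.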


Results 


Table 1 presents the measurements 
of vocal fold length for each of the 
experimental conditions. These data 
are tabled by rows for individual sub- 
jects within each of the four pitch 
groups. Of the 24 subjects selected, four
could not be used. Subject LM5 had 
excessive lingual tonsillar growth, sub- 
ject HM3 reacted to the cocaine in 
the topical anesthesia, and subjects 
LF5 and HF1 did not report for this 
procedure. Of the remaining 20 sub- 
jects not all were able to expose their 
vocal folds satisfactorily for each of 
the experimental conditions. Hence, 
there are fewer than five measurements 
on a number of subjects. In fact, meas- 
ures were obtained for all five condi- 
tions on only eight of the subjects. 
In all, 68 of a possible 100 values (one


abducted and four phonating condi- 
tions for 20 subjects) were obtained. 
For this reason no statistical treatment 
of these data was attempted. 


Even with allowance for the limit- 
ed data available, some relationships 
are apparent. The findings show rather 
consistent differences among pitch level 
groups. At all frequencies average vo- 
cal fold length is of descending magni- 
tude from low male to high male to 
low female and finally to high female. 
These measurements are in substan- 
tial agreement with Hollien’s (3) find- 
ings for general laryngeal size. 


Systematic elongation of the vocal 
folds as the fundamental frequency of 
phonation is raised may also be seen. 
This trend is quite consistent and ex- 
cept for the falsetto condition only a 
few reversals are evident. However, 
the trend does not agree with Irwin’s 
(4) findings that the major extent of 
change is confined to the low end of 
the pitch range. Examination of the 
table indicates that for these subjects 
lengthening accompanies rise in pitch 
throughout the portion of the normal 
pitch register covered in this study. 
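
The consistency of the lengthening trend can be examined directly from Table 1. The sketch below is an illustration rather than an analysis performed in the study. It uses only six of the rows with all five cells filled, so that every value is unambiguously assigned to a condition; subjects HM-2 and LF-3 are also complete but are omitted here because each contains an entry that appears doubtful in this copy.

```python
# Table 1 rows in millimeters:
# (abducted, low 10%, medium 25%, high 50%, falsetto 85%).
COMPLETE_ROWS = {
    "LM-4": (21.8, 11.2, 13.5, 16.3, 17.3),
    "HM-1": (13.3, 10.0, 9.9, 14.1, 14.4),
    "LF-4": (12.0, 7.2, 9.9, 11.3, 10.7),
    "LF-6": (11.0, 8.2, 6.9, 9.6, 10.8),
    "HF-3": (9.7, 7.1, 7.2, 8.8, 9.5),
    "HF-6": (10.4, 6.2, 6.9, 8.4, 7.6),
}

def normal_register_reversals(row):
    """Count steps within the normal register where length fails to increase."""
    _, low, medium, high, _ = row
    return int(medium < low) + int(high < medium)

reversals = {s: normal_register_reversals(r) for s, r in COMPLETE_ROWS.items()}

# Subjects whose longest phonation length exceeds the abducted length.
exceeds_abducted = [s for s, r in COMPLETE_ROWS.items() if max(r[1:]) > r[0]]

print("reversals per subject:", reversals)
print("phonation longer than abduction:", exceeds_abducted)
```

Among these six subjects there are two reversals within the normal register, each of about 1 mm, and one subject whose folds were longer during phonation than in abduction.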


The first column of Table 1 lists 
the values of vocal fold lengths meas- 
ured with the folds in the abducted 
position. It is quite striking that in only 
one case does the length of the folds 
during phonation exceed these values. 
Hence, it would appear that the vocal 
folds are nearly always shortened 
during phonation as compared to their 
length when abducted. Only in rare 
instances does fold length during pho- 
nation approach that measured for the 
abducted position. This has direct 
bearing on the frequently asserted no-
tion that to initiate phonation the
vocal folds are adducted and stretched. 
If, as seen here, the folds actually are 
shortened when adducted, this concept 
will have to be re-examined. 

It must be remembered that this 
study examined only a few pitches 
throughout each subject’s range. The 
finer details of how length may change 
with relatively small changes in vocal 
pitch have not been investigated. Addi- 
tional studies are needed to cross 
validate the findings here presented 
and to investigate the relationships be- 
tween vocal pitch and vocal fold 
length in greater detail. 

Finally, these data are consistent with 
the myoelastic theory of voice produc- 
tion. That is, the lengthening trends 
reported undoubtedly would be re- 
flected in variations of the mass and 
thickness and possibly compliance of 
the vocal folds. If the myoelastic 
theory is correct, such factors must 
govern vocal pitch. On the other hand, 
the neurochronaxic theory, which ac- 
counts for pitch changes in terms of 
the rate of neural impulses to the vocal 
folds, would not predict the changes 
described above. 


Summary 


Four groups of subjects were chos- 
en primarily on the basis of pitch. 
range, age, ability to produce specified 
vocal tones easily, and absence of 
speech or voice defects. A laryngo- 
scopic photography procedure allow- 
ed for measurements of the length of 
each subject’s vocal folds under five 
conditions, one of no phonation, four 
of phonation, at four different funda- 
mental frequencies. Differences in vo- 


cal fold length among groups and 
among pitches for the same subject 
were determined. The resulting con- 
clusions are: (a) As the fundamental 
frequency of phonation is raised, the 
vocal folds systematically lengthen. 
(b) In the abducted position the vocal 
folds are as long as or longer than 
they are for any condition of phona- 
tion. (c) Low pitched individuals ex- 
hibit generally longer vocal folds 
than do individuals with higher pitch 
levels both between the sexes and within
a sex. (d) The findings are generally
consistent with the myoelastic
theory of voice production. 


References 


1. Brackett, I., An analysis of the vibratory
action of the vocal folds during the
production of tones at selected frequencies.
Ph.D. dissertation, Northwestern
Univ., 1947.

2. Farnsworth, D. W., High speed motion
pictures of the human vocal cords. Bell
Lab. Rec. (and film), 18, 1940, 203-208.

3. Hollien, H., Some laryngeal correlates
of vocal pitch. J. Speech Hearing Res.,
3, 1960, 52-58.

4. Irwin, J. V., A study of the relationships
between certain laryngeal mechanisms
and voice pitch. M.A. thesis, Ohio State
Univ., 1940.

5. Moore, P., Vocal fold movement during
vocalization. Speech Monogr., 4, 1937,
44-55.

6. Moore, P., and von Leden, H., (motion
picture) Larynx and Voice: Function of
the Normal Larynx. Chicago: Gould
Found., Laryngeal Res. Lab., 1956.

7. Negus, V. E., The Mechanism of the
Larynx, St. Louis: Mosby, 1929.

8. von Leden, H., and Moore, P., (motion
picture) Larynx and Voice: Physiology
of the Larynx Under Daily Stress. Chicago:
Gould Found., Laryngeal Res.
Lab., 1958.





Measurements of the Vocal Folds 


during Changes in Pitch 


HARRY HOLLIEN 


G. PAUL MOORE 


In a companion article one of the 
present authors (7) has reviewed the 
disagreement shown by previously 
reported data concerning the amount 
of length adjustment which accom- 
panies vocal pitch variation and has 
reported a series of measurements 
showing that some length adjustment 
appears to be a normal correlate of 
pitch change throughout the entire 
pitch range. However, this study em- 
ployed only four rather widely sepa- 
rated pitches for each subject so that 
data for smaller variations in pitch 
were lacking. 

Harry Hollien (Ph.D., University of Iowa,
1955) is Assistant Professor of Logopedics,
University of Wichita and Institute of Logopedics.
G. Paul Moore (Ph.D., Northwestern
University, 1936) is Associate Professor of
Speech Re-education, School of Speech,
Northwestern University, and Director,
Laryngeal Research Laboratory, William and
Harriet Gould Foundation. This article is
an adaptation of a paper presented before
the ad hoc Research Committee of the
Academy of Ophthalmology and Otolaryngology,
October 1958, and a paper presented at
the 1959 convention of the American Speech
and Hearing Association, Cleveland. The research
was supported by the William and
Harriet Gould Foundation.

A second finding reported in
the article previously referred to
was that the folds are nearly always
shorter during phonation than
they are in the abducted position. This 
finding appears to contradict the usual 
notion that the conditions necessary 
for initiation of phonation are adduc- 
tion and ‘stretching’ of the folds. 
French (4), Behnke (1), von Meyer 
(17), and Howard (8), among others, 
have attributed this stretching action 
to the contraction of the cricothyroid 
muscles. On the other hand, Negus 
(13) held that these muscles were not 
powerful enough to account for this 
stretching action. Kenyon (10) pre- 
sented an alternative explanation by 
suggesting that it is the extrinsic laryngeal
muscles which produce the stretching
action, and his view has been
supported by Sokolowsky (15), Russell
(14) and others. If the data which 
show the folds to be contracted or 
shortened rather than lengthened and 
stretched in preparation for phona- 
tion are substantiated by further meas- 
urements, the previously stated con- 
cepts by Behnke, Kenyon, and others 
would appear to need re-evaluation. 
The present article reports further 
measurements similar to those obtained 
earlier by one of the present authors. 
However, the data herein reported 




were obtained with a more refined 
procedure and a larger number of 
pitches were studied, so that the in- 
formation is more complete and de- 
tailed, as well as more precise. 


Procedure 


The data for the present study were 
acquired by means of color motion 
picture photography of the larynges 
of six male subjects. Standard techniques
for indirect laryngoscopic viewing
of vocal cords were employed.
The resulting photographs were projected
and the images of the vocal
folds were measured.

Figure 1. Pitch ranges of the six subjects and
the frequencies at which photographs were
made. Frequency is given in musical notes
and octaves. (Horizontal axis: Subjects A, B, C, D, E, F.)


Subjects. Subjects were six men rang- 
ing in age from 27 to 53 years who 
were able to expose the full anteropos- 
terior extent of their vocal folds while 
phonating fundamental frequencies en- 
compassing the bulk of their pitch 
ranges including falsetto. The vocal 
ranges of these subjects may be seen 
in Figure 1. Subject A may be roughly 
classified as a bass, subjects B and C 
as baritones, and subjects D, E, and F 
as tenors. All subjects had histories of 
being essentially free of laryngeal 
pathology. None had had formal sing- 
ing training. 


Equipment. The photographic ex- 
posures were made with an Arriflex 
16 motion picture camera having a 
six-inch telephoto lens on an exten- 
sion bellows. The lighting system 
consisted of a 2000-watt General Elec- 
tric incandescent lamp, a water cell 
cooler, and a focusing lens. The light 
beam thus produced was focused and 
reflected to the laryngeal mirror. Or- 
dinarily a standard #5 laryngeal mir- 
ror was used but where possible a 
specially constructed adjustable laryn- 
geal mirror with a built-in heater was 
substituted. 


Pitch Conditions. During the experi- 
ment each subject was requested to 
produce pitches at the musical tones 
of C, E, and A within each octave 
from the lowest to the highest musical 
tones sustainable. Vocal pitch was con- 
trolled by reference tones at the 
selected frequencies which were produced
by a Heathkit AG-9A audio-generator
connected to a Magnasync,
Model M-80 amplifier and speaker 
system. Since all of the subjects were 
relatively adept in matching a tone, a 
more elaborate cueing system was 
found unnecessary. No control of vocal 
intensity was attempted other than by 
requesting the subjects to produce all 
pitches at a ‘comfortable’ loudness 
level. 
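
The reference tones can be related to frequencies in cycles per second with a short calculation. The sketch below is an illustration only: it assumes equal temperament with A4 = 440 cps and an octave numbering in which A2 = 110 cps, neither of which is stated in the article, and the function name `tone_frequency` is hypothetical. Under those assumptions the computed values fall close to the rounded frequencies the article reports (for example 65, 82, and 110 cps).

```python
# Equal-tempered frequencies for the tones C, E, and A in each octave.
# Assumptions (not from the article): A4 = 440 cps, equal temperament,
# and octave numbering in which A2 = 110 cps.
SEMITONES_ABOVE_C = {"C": 0, "E": 4, "A": 9}


def tone_frequency(note: str, octave: int, a4: float = 440.0) -> float:
    """Return the frequency in cps of a tone such as ("E", 3)."""
    # Distance in semitones from A4 (A is 9 semitones above C).
    semitones_from_a4 = (octave - 4) * 12 + SEMITONES_ABOVE_C[note] - 9
    return a4 * 2.0 ** (semitones_from_a4 / 12.0)


if __name__ == "__main__":
    for octave in range(2, 6):
        for note in ("C", "E", "A"):
            print(f"{note}{octave}: {tone_frequency(note, octave):6.1f} cps")
```

Rounding these values to two significant figures gives a series comparable to the nominal frequencies used as pitch conditions.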


Anesthesia. For two of the six sub- 
jects a light topical anesthetic was used 
on the soft palate and posterior pha- 
ryngeal wall to reduce the gag reflex 
and to allow comfort in using the 
laryngeal mirror. To insure comparable 
conditions from subject to subject and 
to facilitate full exposure of the vocal 
folds, each subject was requested to 
hold his tongue gently in a protruded 
position with a gauze square. 


Photographic Procedure. The subject
sat before a fixed laryngeal mirror
on an adjustable stool and introduced
the mirror into his pharynx. He
then viewed his larynx by means of 
an auxiliary mirror and familiarized 
himself with the position and appear- 
ance of his vocal folds. Concurrently, 
an experimenter observed the subject’s 
larynx through the camera viewing 
system and noted the vocal pitch that 
best allowed the exposure of the entire 
anteroposterior length of the vocal 
folds. This was chosen as the initial 
condition for photography. 

For each experimental condition an 
investigator observed the image of the 
subject’s vocal folds in the laryngeal 
mirror and checked the accuracy with 
which he matched the desired pitch. 
When the proper laryngeal image was 




visualized and the required pitch pro- 
duced, the camera was triggered to 
expose the desired length of film. Prior 
to each such exposure an identifying 
number was photographed. This num- 
ber was entered into a log along with 
information pertaining to the pitch 
produced and any other remarks that 
would assist the investigators in the 
subsequent analysis of the film. 


Measurement Corrections. To make 
accurate measurements from laryngeal 
photographs it is necessary to know 
the relationship between the size of the 
photographed image on the film and 
the actual size of the structure to be 
measured. Usually this requires knowl- 
edge of the film-to-field distance, a 
measurement which cannot be readily 
obtained. Hollien (7) attempted to 
meet this problem by making correc- 
tions based on film-to-field estimates 
obtained from laminagraphic x rays. 

In the present study an improved 
procedure was possible by exploiting 
certain features of the Arriflex 16 
camera. With the telephoto lens and 
extension bellows, this camera allowed 
the investigators a magnified image of 
the folds and provided a very shallow 
depth of field. These two features 
made it possible to bring the photo- 
graphic image into very accurate focus 
and to carry out the following meas- 
uring procedure. After laryngeal 
photographs were made during each 
condition of phonation or abduction, 
the subject removed himself from the 
laryngeal mirror and while all focal 
adjustments remained constant, a milli- 
meter grid was brought into the 
photographic field previously occupied 
by the vocal folds and was adjusted 










vertically with a rack and pinion gear 
until an exact focus was obtained. This 
millimeter scale was then photographed 
and used later to measure the length 
of the folds. Specifically, these meas- 
urements were made by projecting the 
image of the grid for a given condi- 
tion onto a table top via an Ampro- 
matic Model 500, 2” x 2” slide projector 
with a special 16-mm film gate and 
an associated mirror system. The grid 
was then carefully traced and selected 
film frames showing the vocal folds 
of that particular experimental condi- 
tion were projected onto the grid. The 
measurements were then made directly 
in millimeters. Error due to variation
in the distance from the vocal folds
to the film (caused by the relative
elevation of the folds for different
pitches and subject-to-subject size
differences) and the constant projector
magnification factor thus were corrected
in one process. Additional accuracy
was obtained by employing an
enlargement ratio of 12:1 to 14:1, which
reduced the effect of small measurement errors.
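
The grid procedure amounts to a single scale factor: whatever the projected size of one millimeter of grid turns out to be, the same factor converts the projected fold image to millimeters. The following is a minimal sketch of that arithmetic, not the authors' actual worksheet. The function names and the numbers in the example (a 10-mm span of grid tracing out at 130 units on the table top, a fold image spanning 208 units) are hypothetical.

```python
def mm_per_projected_unit(grid_span_units: float, grid_span_mm: float) -> float:
    """Scale factor from projected units to millimeters at the plane of the folds.

    grid_span_units: length of a traced stretch of the projected grid, in any
                     convenient unit measured on the table top.
    grid_span_mm:    the true length of that stretch of grid, in millimeters.
    """
    return grid_span_mm / grid_span_units


def fold_length_mm(fold_span_units: float,
                   grid_span_units: float,
                   grid_span_mm: float) -> float:
    """Convert a projected fold measurement to millimeters using the grid."""
    return fold_span_units * mm_per_projected_unit(grid_span_units, grid_span_mm)


if __name__ == "__main__":
    # Hypothetical example: 10 mm of grid projects to 130 units (a 13:1
    # enlargement if the units are millimeters on the table top).
    print(round(fold_length_mm(208.0, 130.0, 10.0), 1))
```

Because the grid and the folds are photographed with identical focal adjustments and projected identically, film-to-field distance and projector magnification never need to be known separately.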


In order to evaluate the measuring 
procedure just described, a preliminary 
investigation was carried out. A 2% 
error was arbitrarily selected as the 
maximum deviation allowable. Photo- 
graphs of a grid were taken and meas- 
urements made while three parameters 
were varied. The first variable evalu- 
ated was the distance from the photo- 
graphic field to the film. It was found 
that a vertical variation of more than 
1 cm in either direction was necessary 
before errors as large as 2% were 
observable. The second parameter 
studied was variation in the angle of 
the laryngeal mirror in relationship to 


the grid (or vocal folds). Measure- 
ments here indicated that changes in 
mirror angle must exceed 3° in either 
direction before errors of the specified 
magnitude would be produced. Finally, 
the effect of changes in the horizontal 
angle of the grid (vocal folds) was 
investigated. Variation of this parame- 
ter in excess of 10° was necessary 
before the 2% limit of error was 
reached. It may be seen from the fore- 
going discussion that variations of the 
three parameters must be considerable 
before any appreciable error is pro- 
duced in the measurements. In addition, 
and even more important, blurring of 
the field takes place long before these 
limits are reached. Thus as long as the 
vocal folds are in focus over their 
entire anteroposterior length when they
are photographed, less than 2% error
can be expected.

Figure 2. Schematic drawing of the larynx
as seen in laryngoscopic photographs. E is
epiglottis, F, the vocal folds, and T, the
tubercles formed by the corniculate and
arytenoid cartilages. Line A is drawn tangent
to the most anterior extent of the vocal
folds; line B tangent to a selected blood
vessel; line C tangent to the arytenoid tubercles;
and line D tangent to the most posterior
boundary of the folds. Measurements
were made from A to D, A to C, and B to C
and are represented by lines 1, 2, and 3,
respectively.
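
The tolerance reported for horizontal tilt of the grid in the preliminary investigation is roughly what simple foreshortening would predict. The sketch below is a geometric reading offered as an assumption, since the article reports only the empirical result: a length tilted by an angle θ away from the plane of focus is taken to appear shortened by the factor cos θ. The function names are hypothetical.

```python
import math


def foreshortening_error_percent(tilt_degrees: float) -> float:
    """Percent shortening of a length tilted by tilt_degrees (cosine model)."""
    return 100.0 * (1.0 - math.cos(math.radians(tilt_degrees)))


def tilt_for_error_percent(error_percent: float) -> float:
    """Tilt in degrees at which the cosine model reaches a given percent error."""
    return math.degrees(math.acos(1.0 - error_percent / 100.0))


if __name__ == "__main__":
    print(f"error at 10 degrees: {foreshortening_error_percent(10.0):.2f}%")
    print(f"tilt giving 2% error: {tilt_for_error_percent(2.0):.1f} degrees")
```

Under this model a 10° tilt produces an error of about 1.5%, and the 2% limit is reached at a little over 11°, which agrees with the statement that tilt in excess of 10° was needed. The model says nothing about the mirror-angle and distance tolerances, which depend on the optical arrangement.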


Measurements. Measurements were 
made on three or more suitable film 
frames for each pitch for each subject; 
the average number of frames meas- 
ured per pitch per subject was six. 
Since there was no a priori evidence to 
show that any one of several possible 
reference points would be superior to 
any other as a basis of measurement, 
each frame was treated in three differ- 
ent ways. 


Figure 2 is a schematic drawing of 
the view seen in laryngoscopic pho- 
tography and illustrates the three sets 
of boundaries used in obtaining vocal 
fold dimensions. The first of these was 
the entire anteroposterior length from 
the anterior commissure to the most
posterior border of the posterior
commissure. This measurement was particularly
desirable because, if both
these anterior and posterior bound- 
aries could be determined accurately, 
the resulting values would correspond 
closely to the actual vocal fold length. 
It is felt that the enlargement of the 
photographic images allowed such 
identification to be made with confi- 
dence; in any event, this dimension 
may be seen in Figure 2 as the distance 
from A to D and is represented by 
line 1. 


The second set of measurements is 
represented by line 2 and was made 
from the most anterior point of the 
folds (A) to a line drawn tangent to 
the anterior borders of the tubercles 
formed by the protrusions of the cor- 
niculate and arytenoid cartilages (C). 
Although this posterior boundary is 




somewhat arbitrary and the resulting 
length measurements do not represent 
the total dimension of the vocal folds, 
the measurements have a consistency 
which is slightly better than for meas- 
urements obtained by the first ap- 
proach. The third type of measurement 
used artificial boundaries both anteri- 
orly and posteriorly and is represented 
by line 3. The posterior boundary was 
the same frontal plane (C) used in 
the second method above. The an- 
terior point was a selected blood ves- 
sel on the surface of one fold (B). A 
feature of this procedure is that, if a 
suitable blood vessel can be found, it 
is possible to obtain values without the 
possible error that could arise from 
attempts to determine the exact border 
line between the vocal folds and their 
adjacent structures. Unfortunately, 
suitable reference points of this type 
could be located on only four of the 
six subjects. 


The three sets of measures described 
above were organized in two ways. 
First, the maximum single measure for 
a subject at each of the vocal pitches 
and for abduction was recorded. Then 
the mean value of all measurements 
for that subject for each of these ex- 
perimental conditions was determined. 
Examination of the trends of each of 
these six tables showed them to be 
identical; that is, any relationship or 
trend on any one of the tables was 
apparent on any other. Even the mag- 
nitude of these trends varied but little 
and there were no reversals. Accord- 
ingly, a single table was chosen for 
evaluation, and it should be noted that 
any discussion of this selected table is 
equally applicable to any of the other 
sets of data. 







Table 1. Mean measurements (in millimeters) of the anteroposterior length of the vocal folds of six
subjects during phonation and for abduction, made from laryngeal photographs. Where there are no
cell entries, the pitch was outside the subject’s range.

Condition                          Subjects
                  A        B        C        D        E        F

 65 cps (C2)   [illegible]
 82 cps (E2)     13.8      9.4   [illegible]
110 cps (A2)     15.2      9.8     12.7     11.8      8.7      8.9
130 cps (C3)     16.0     10.0     14.1     13.1     11.9     10.0
170 cps (E3)     17.9     11.9     17.3     13.4     13.0     11.9
220 cps (A3)     16.4†    12.0†    19.4     14.4     14.0     12.6
260 cps (C4)     17.2     13.0     20.5     15.6†    14.9     13.0
330 cps (E4)     17.3     13.0     16.9†    14.2     12.2†    14.3†
440 cps (A4)     17.8     13.5     16.6      *       11.3     10.2
520 cps (C5)   [illegible] 12.2    14.8   [illegible]  9.6
660 cps (E5)              10.6**   14.7** [illegible] [illegible]
880 cps (A5)   [illegible]
Abduction        23.0     17.3     22.3     20.3     19.1     19.8

*No measurable frames.
†First falsetto frequency.
**Interpolated measures.
[Entries marked illegible could not be recovered from the scan. In the 65, 520, 660, and 880 cps
rows the column positions of entries other than those for subjects B and C are uncertain.]


Results and Discussion 


In Table 1 are listed the means of 
the vocal fold measurements of the 
total anteroposterior length for each 
of the vocal pitch conditions and for 
abduction. Considering only pitches in 
the normal or natural register, inspec- 
tion of any column will show that as 
expected (3, 9, 11, 12, 16) the vocal 
folds elongate systematically as pitch 
is increased. Subject F’s folds, for example,
lengthened from 8.9 mm at 110
cps to 13.0 mm at 260 cps. Subject C’s
elongated from 12.7 mm to 20.5 mm 
for the same frequencies. These data 
support the classic hypothesis that the 
vocal folds are lengthened with rise in 
pitch. On the other hand, inspection 
of adjacent cells shows that the difference
in vocal fold length between
pitches is often very small. 

Inspection of the values reported 
for the falsetto pitches indicates that 
the lengthening process for this regis- 
ter, unlike that for the natural regis- 
ter, follows not one but three patterns. 
In the first pattern (subject A), the 
vocal folds were shortened for the 
lower falsetto pitches compared to 
their length for the higher tones in the 
natural register. Subsequently, the 
lengthening pattern was continued as 
the fundamental frequency of falsetto 
phonation was raised. In the second 
pattern (subject B) the systematic 
lengthening of the natural register was 
continued into falsetto, but at the very 
high falsetto pitches the trend was reversed
and the folds became progressively
shorter. Finally for the third
pattern (subject C) the folds became 
shorter immediately when the shift to 
falsetto was made and then continued 
to shorten progressively as fundamental 
frequency was further increased. 
Hence, while the classic pattern of 
the elongation of the vocal folds with 
increases in vocal pitch appears to be 
borne out in the natural register, the 
same is not true of the falsetto. 

No evidence can be found to sup- 
port the notion that the major portion 
of lengthening occurs in a particular 
part of the pitch range. Apparently 
the elongation process varies from 
subject to subject, so that some indi- 
viduals show greatest lengthening in 
the low frequencies, while others show 
maximum variations in the middle fre- 
quencies, and still others do not con- 
form to a specific pattern. 

In the companion article previously 
mentioned (7) one of the authors pre- 
sented measurements showing that the 
vocal folds are shortened for phona- 
tion rather than stretched relative to 
their abducted length. A comparison 
of the measurements of the folds in 
the abducted position with those dur- 
ing phonation indicates that this con- 
clusion is fully supported by the data 
here reported. 

The vocal fold lengths of all sub- 
jects in any specific vocal pitch show 
considerable variation. As an example, 
the mean anteroposterior length at 130 
cps varies from 10.0 mm for subjects 
B and F to 16.0 mm for subject A. 
Hollien (6) found that the thickness 
of the vocal folds at a specific funda- 
mental frequency was approximately 
the same from subject to subject irrespective
of the pitch level of voice.
Apparently this is not true for the
length parameter of the vocal folds.

Figure 3. Variation in vocal fold length with
change in vocal pitch. Vocal fold lengths
during phonation are plotted by subjects as
percentages of the abducted length. Relative
frequency level values are plotted for the
natural register only as percentages of the
subject’s pitch range above his lowest sustainable
tone. (Vertical axis: Percent of Rest Length. Horizontal axis: Percent of Total Range.)

Curtis,¹ in evaluating Brackett’s (2)
study, has commented that the data tend
to suggest a ‘stair step’ function in the
lengthening patterns of the vocal folds.
That is, it would appear that the process
of elongation is not smoothly continuous
but rather that abrupt length
adjustments are made periodically.
The measurements reported here tend 
to support this notion and although 
it could be conjectured that this re- 
lationship would be more apparent if 
the pitch intervals were not quite so 
gross, several of the subjects show 
little or no difference between adjacent 
cells at certain portions of their fre- 
quency range. 

¹Personal communication.

The findings (5, 6, 7) that laryngeal
size and vocal fold length correlate
highly with an individual’s pitch level
are only partially supported by these 
data. If subject B is omitted, there is a 
trend, for any pitch condition in the 
natural register and the abducted posi- 
tion, of decreasing length from the 
lower pitched voices to the higher. Ex- 
amination of vocal fold lengths at 130 
cps will show that subject A’s length 
is 16.0 mm; subject C’s is 14.1 mm, and 
for the three tenors, 13.1, 11.9, and 
10.0 mm, respectively. Subject B is a 
notable exception and the authors are 
unable to explain this variation from 
the trend. 


Figure 3 is a graphic representation 
of the values for the normal register 
in Table 1 with all scores transformed 
to percentages in order that the rela- 
tionships may be more easily observed. 
That is, since the subjects’ pitch ranges 
were dissimilar and since the absolute 
vocal fold length for each subject was 
different at a given frequency, the ab- 
solute frequency levels were converted 
to relative values. Hence, each frequency
is plotted along the abscissa as
a percentage of that subject’s total 
pitch range including falsetto. In ad- 
dition, vocal fold lengths (in milli- 
meters) for each subject-pitch condi- 
tion are plotted as a percentage of the 
abducted length of that subject’s folds. 
For example, the lowest frequency at 
which subject A’s folds could be satis- 
factorily photographed is 11% of his 
total pitch range above the lowest tone 
he can sustain, and the length of his 
vocal folds for this pitch is 60% of 
their length in the abducted position. 
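
The length transformation can be checked directly against Table 1. The sketch below uses subject A’s natural-register means and abducted length as printed in the table; the names are hypothetical, and the pitch-range percentages are not reproduced because they depend on each subject’s lowest sustainable tone, which is given only graphically in Figure 1.

```python
# Mean lengths in millimeters for subject A, natural register only,
# keyed by fundamental frequency in cps (values as printed in Table 1).
SUBJECT_A_LENGTHS_MM = {82: 13.8, 110: 15.2, 130: 16.0, 170: 17.9}
SUBJECT_A_ABDUCTED_MM = 23.0


def percent_of_rest_length(length_mm: float, abducted_mm: float) -> float:
    """Express a phonation length as a percentage of the abducted length."""
    return 100.0 * length_mm / abducted_mm


def to_percentages(lengths_by_cps: dict, abducted_mm: float) -> dict:
    """Apply the transformation to every pitch condition for one subject."""
    return {cps: percent_of_rest_length(mm, abducted_mm)
            for cps, mm in lengths_by_cps.items()}


if __name__ == "__main__":
    for cps, pct in to_percentages(SUBJECT_A_LENGTHS_MM,
                                   SUBJECT_A_ABDUCTED_MM).items():
        print(f"{cps:4d} cps: {pct:5.1f}% of abducted length")
```

The first value, 13.8 mm against 23.0 mm, gives the 60% cited in the text for subject A’s lowest photographed pitch.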

It becomes apparent (Figure 3) that
for none of the conditions of phonation 
are a subject’s vocal folds as long as 
they are for that subject at rest. It is 


evident also that the vocal folds elon- 
gate progressively with increases in 
vocal pitch. Further, it may be noted 
that the curve for any subject is very 
similar to the curve for any other sub- 
ject even though the group as a whole 
is dissimilar in terms of pitch level and 
pitch range. This is especially true of 
the lower frequencies where it would 
seem that a single curve could ade- 
quately represent all six subjects. For 
the middle to high frequencies, this 
trend may be seen also although it is 
not quite as striking. 

Finally, it must be pointed out that 
these results are generally consistent 
with the myoelastic theory of voice 
production. On the other hand, the re- 
lationships here demonstrated are 
neither required nor explained by the 
neurochronaxic conceptualization of 
phonation. 


Summary 


Six adult males, who had demon- 
strated the ability, were required to 
expose the entire anteroposterior length 
of their vocal folds while phonating 
three notes within each octave of their 
pitch range. Measurements were made 
from laryngoscopic photographs. 

From these measurements the fol- 
lowing conclusions are drawn: (a) 
length of vocal folds increases sys- 
tematically with increases in vocal 
pitch for the natural register; (b) vo- 
cal folds in abduction are longer than 
in phonation; (c) no single pattern of 
elongation or shortening is apparent 
for pitches in the falsetto register; 
(d) magnitude of lengthening is no 
greater in one portion of the pitch 
range than in any other; (e) there is 







some evidence of a ‘stair step’ length- 
ening function; (f) general vocal fold 
length appears to bear a moderate re- 
lationship to pitch level but does not 
correlate with the absolute fundamental 
frequency being phonated. These data 
tend to support the myoelastic theory 
of voice production. 


References 


1. Behnke, E., The Mechanism of the Human
Voice. (14th ed.) London: J. Curwen,
1900.

2. Brackett, I., An analysis of the vibratory
action of the vocal folds during the
production of tones at selected frequencies.
Ph.D. dissertation, Northwestern
Univ., 1947.

3. Farnsworth, D. W., High speed motion
pictures of the human vocal cords. Bell
Lab. Rec. (and film), 18, 1940, 203-208.

4. French, T. R., On a perfected method
of photographing the larynx. N. Y. med.
J., 40, 1884, 653-656.

5. Hollien, H., Some laryngeal correlates
of vocal pitch. J. Speech Hearing Res.,
3, 1960, 52-58.

6. Hollien, H., A study of some laryngeal
correlates of vocal pitch. Ph.D. dissertation,
Univ. Iowa, 1955.

7. Hollien, H., Vocal pitch variation related
to changes in vocal fold length.
J. Speech Hearing Res., 3, 1960, 150-156.

8. Howard, J., Physiology of Artistic Singing.
Boston: Howard, 1886.

9. Irwin, J. V., A study of the relationships
between certain laryngeal mechanisms
and vocal pitch. M. A. thesis, Ohio State
Univ., 1940.

10. Kenyon, E. L., Relation of oral articulative
movements of speech and of extrinsic
laryngeal musculature in general
to function of vocal cord. Arch. Otolaryng.,
5, 1927, 481-501.

11. Moore, P., Vocal fold movement during
vocalization. Speech Monogr., 4, 1937,
44-55.

12. Moore, P., and von Leden, H., (motion
picture) Larynx and Voice: Function of
the Normal Larynx. Chicago: Gould
Found., Laryngeal Res. Lab., 1956.

13. Negus, V. E., The Mechanism of the
Larynx, St. Louis: Mosby, 1929.

14. Russell, G. O., Speech and Voice. New
York: Macmillan, 1931.

15. Sokolowsky, R. R., Effect of the extrinsic
laryngeal muscles on voice production.
Arch. Otolaryng., 38, 1943, 355-364.

16. von Leden, H., and Moore, P., (motion
picture) Larynx and Voice: Physiology
of the Larynx Under Daily Stress. Chicago:
Gould Found., Laryngeal Res.
Lab., 1958.

17. von Meyer, G. H., The Organs of
Speech. New York: Appleton, 1884.








Reliability of Language Measures 


and Size of Language Sample 


FREDERIC L. DARLEY 


KENNETH L. MOLL 


The study here reported was designed 
to answer the question of how large a 
sample of children’s connected speech 
must be elicited in order to obtain 
reasonably reliable scores representing 
the average length and the structural 
complexity of linguistic utterances. 


Two Language Measures 


Studies of the manner in which chil- 
dren’s language develops have typically 
included some description of connected 
speech samples in terms of amount of 
verbal output and grammatical com- 
plexity of sentences used. Two meas- 
ures of children’s linguistic achievement 
which have been used in several studies 
are the mean length of response and 
the structural complexity of response. 


Frederic L. Darley (Ph.D., University of
Iowa, 1950) is Associate Professor, Department
of Speech Pathology and Audiology,
University of Iowa. Kenneth L. Moll (M.A.,
University of Iowa, 1959) is Research Associate,
Department of Speech Pathology and
Audiology, University of Iowa.

Mean Length of Response. In the first
of a series of related studies on children’s
linguistic skills done at the University
of Minnesota Institute of Child
Welfare, McCarthy (5) wrote down
verbatim 50 consecutive verbal responses
elicited from each of her 140
subjects. In the hope of obtaining spon- 
taneous responses she used picture 
books and toys to overcome self-con- 
sciousness and establish rapport, and she 
addressed the child as little as possible 
during the observation. McCarthy de- 
veloped a set of rules for identifying 
a response, counting the words in each 
response, and classifying each response 
with regard to grammatical complete- 
ness and complexity. 


McCarthy used mean length of re- 
sponse (MLR) as her main measure of 
children’s linguistic achievement, a 
measure earlier advocated by Nice (6) 
and used by Smith (8) in her analysis 
of hour-long samples of spontaneous 
speech recorded by hand in a group 
free-play situation. Nice (6) had sug- 
gested that ‘this average sentence length 
may well prove to be the most impor- 
tant single criterion for judging a 
child’s progress in the attainment of 
adult language.’ McCarthy (5, p. 50) 
originally called MLR ‘the simplest and 
most objective measure of the degree 
to which children combine words at 
the various ages’; more recently (4, pp. 
550-551) she has stated that no measure 
‘seems to have superseded the mean 
length of sentence for a reliable, easily 




determined, objective, quantitative, and 
easily understood measure of linguistic 
maturity.’ 

McCarthy (5, p. 46) calculated the 
reliability of the responses by correla- 
ting the odd- with the even-numbered 
responses in order ‘to see how consist- 
ently the children used responses of a 
certain length.’ She reported (5, p. 47) 
that ‘the mean reliability coefficient for 
the analysis according to length of re- 
sponse was +.91, the range being from 
+.82 to +.97 for the various age levels’ 
(seven levels at six-month intervals 
from 1.5 years to 4.5 years). She also 
broke her samples of 50 responses into 
five groups of 10 responses each (first 
10, second 10, . . .). She found (5, pp. 
66-67) that ‘the children’s responses 
tended to be somewhat shorter at first, 
but that there is little change in the 
mean length after the first ten or 
twenty responses.’ 
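
The two computations described here, mean length of response and an odd-even reliability coefficient, can be sketched briefly. This is one plausible reading, stated as an assumption: each child’s odd-numbered and even-numbered responses are averaged separately and the two sets of means are correlated across children with a Pearson coefficient. McCarthy’s exact computation is not given in this article, the function names are hypothetical, and the word counts are invented for illustration only.

```python
from math import sqrt


def mean_length_of_response(word_counts):
    """Mean number of words per response for one child."""
    return sum(word_counts) / len(word_counts)


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def odd_even_reliability(samples):
    """Correlate each child's odd-response mean with the even-response mean."""
    odd_means = [mean_length_of_response(s[0::2]) for s in samples]   # 1st, 3rd, ...
    even_means = [mean_length_of_response(s[1::2]) for s in samples]  # 2nd, 4th, ...
    return pearson_r(odd_means, even_means)


# Hypothetical word counts for four children, six responses each.
HYPOTHETICAL_SAMPLES = [
    [1, 2, 1, 2, 2, 1],
    [3, 4, 3, 5, 4, 4],
    [5, 6, 7, 5, 6, 6],
    [2, 3, 2, 2, 3, 3],
]

if __name__ == "__main__":
    for child in HYPOTHETICAL_SAMPLES:
        print(f"MLR = {mean_length_of_response(child):.2f}")
    print(f"odd-even r = {odd_even_reliability(HYPOTHETICAL_SAMPLES):.2f}")
```

The same functions could be applied to successive blocks of ten responses to look for the early shortening McCarthy describes.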


Structural Complexity of Response. 
McCarthy’s (5, p. 42) analysis of re- 
sponses classified into structural 
complexity categories (functionally 
complete but structurally incomplete 
responses, simple sentences without 
phrase, simple sentences with phrase, 
compound sentences, complex sen- 
tences, elaborated sentences, and in- 
complete responses) was adopted in 
an attempt ‘to indicate the stage of 
grammatical complexity that the child 
has reached, or in other words, how 
closely his sentence structure ap- 
proximates adult conversation, his sole 
criterion upon which to model his 
speech.’ She reported high agreement 
among four judges in classifying re- 
sponses with regard to structural com- 
plexity and considered the method 
reliable enough for use with groups. 




Day (2) used McCarthy’s procedures,
definitions, classifications, and
methods of analysis in her study of
two-to-five-year-old twins, again re- 
cording manually 50 consecutive verbal 
responses. Davis (1) used McCarthy’s 
procedures with minor modifications in 
her study of 436 children (twins, single- 
tons with siblings, and only children) 
at three age levels, 5.5, 6.5, and 9.5 
years. She too collected samples of 50 
responses, usually consecutive, and for 
their analysis she amplified and clarified 
McCarthy’s definitions and rules for 
sentence classification. Most recently
Templin (10, p. 15) ‘duplicating as 
nearly as possible the technique de- 
veloped by McCarthy’ elicited 50 ver- 
bal utterances, usually consecutive, 
from 480 children, 30 boys and 30 girls 
at each of eight age levels (3, 3.5, 4, 4.5, 
5, 6, 7, and 8 years). Her analysis in 
terms of MLR and structural complex- 
ity was based upon Davis’ modification 
of McCarthy’s rules. 


In an earlier study Williams (12) 
used measures comparable to but not 
identical with those developed by Mc- 
Carthy and used by Day, Davis, and 
Templin. He made a phonetic tran- 
script of the spontaneous speech of 
children in a group play situation, the 
sampling unit (12, p. 10) being ‘what 
seems preferable to be called an expres- 
sion unit rather than a sentence... . 
Forty such units were sampled for each 
child.’ Williams calculated the mean 
length of the expression units and de- 
vised two quantitative scores to indi- 
cate expression unit completeness and 
complexity. ‘The classifications of un- 





*The most [illegible] of these
rules is to be found in Templin (10, pp. 160-161).





168 Journal of Speech and Hearing Research 


intelligible, incomplete, and complete 
were weighted arbitrarily as 0, 1, and 
2.... Arbitrary weights of 0, 1, 2, 3, 
and 4 were given to unintelligible, 
simple, [compound], complex, and 
compound-complex units.’ 


Templin also devised a quantitative
method of representing sentence
completeness-complexity. She assigned
weights as follows to the categories of
the McCarthy-Davis outline in order to
obtain a structural complexity score
(SCS):

Weight   Classification of Remark

0        Incomplete remarks (even if functionally complete)
1        Simple sentences (with or without phrases)
2        Simple sentences with two or more phrases or a
         compound subject or predicate with a phrase
3        Compound sentences
         Complex and elaborated sentences

Templin found the SCS to be not quite
as stable as other language measures
used in her study, but she points out
its advantage in being a quantitative
measure, permitting comparison with
other quantitative language measures
such as MLR and vocabulary size.
Further use of MLR and SCS, gen- 
erally following Templin’s procedures, 
has been reported by Spriestersbach, 
Darley, and Morris (9) in a study of 
the language skills of 40 children with 
clefts of the palate and by Winitz (13) 
in a study of the language skills of 150 
normal kindergarten boys and girls. In 
both of these studies 50 responses were 
elicited in the usual way, but the pro- 
cedures followed by Templin and her 
predecessors were modified in the light 
of McCarthy’s finding that children’s 


first 10 or 20 responses tend to be 
shorter than subsequent responses: the 
first 10 responses were disregarded, the 
next 50 being recorded. Winitz (13) 
tape recorded his samples for analysis 
and also provided supplementary defi- 
nitions and directions for counting and 
classifying words and remarks. 


Size of Language Sample 


The research and clinical usefulness 
of MLR and SCS as indices of chil- 
dren’s linguistic developmental status is 
somewhat limited by the expenditure 
of time required to record and analyze 
the by now almost traditional 50 re- 
sponses. The question arises as to 
whether equally reliable information 
can be obtained from an analysis of 
fewer than 50 responses. 


Nice (6) offered the following com- 
ment concerning size of speech sample: 
‘Although as few as 30 such sentences 
ought to show clearly a child’s stage of 
speech development, for any compar- 
ative or detailed study there should be 
at least 100 sentences and preferably 
more.’ McCarthy (5, p. 32) explained 
her choice of speech sample size (50 
responses) thus: ‘This number was decided
upon because it would give a
fairly representative sample of the 
child’s linguistic development in a rel- 
atively short period of time, without 
tiring the child with a prolonged ob- 
servation.’ She commented that the 
reliability of structural complexity 
measures would be increased if longer 
samples of each child’s conversation 
were obtained. Day, Davis, Templin, 
Spriestersbach, Darley, and Morris, and 
Winitz do not discuss their choice of 
similar sample size. 


It was noted above that Williams 












TABLE 1. Means and standard deviations for two language measures, mean length of response (MLR)
in words and structural complexity score (SCS) in scale points, computed from language samples of
seven different sizes obtained from 150 normal five-year-old children.

                           Number of Responses in Sample
Measure   Statistic       5      10      15      20      25      35      50

MLR       Mean         5.87    5.63    5.84    5.76    5.56    5.54    5.62
          SD            3.2     2.3     2.3     2.1     1.9     1.9     1.8
SCS       Mean        39.93   39.76   41.18   40.33   39.45   39.06   40.40
          SD          30.45   22.89   20.78   18.76   17.15   15.52   14.39








(12, p. 10) used only 40 expression units 
in his study and considered these to 
constitute ‘a fairly representative sample 
of the child’s expressive control over 
language.’ Schneiderman (7) in a study 
of the relationship between articulatory 
ability and language ability used for her 
measure of sentence length the mean 
length of 15 sentences (oral definitions 
of 15 common nouns after a practice 
session defining five other nouns). The
question arises whether a sample as
small as 10 or 15 responses can yield
MLR and SCS values dependable for
group or individual prediction.


Procedure 


A large group of connected speech 
protocols was available following the 
completion of the study reported by 
Winitz (13). The protocols represented 
50 responses elicited from each of 150 
randomly selected, physically normal, 
white, five-year-old kindergarten chil- 
dren (75 boys, 75 girls) from mono- 
lingual Iowa City homes. The two sex 
groups were essentially equivalent with 
regard to chronological age, IQ, socio- 
economic status, and family con- 
stellation. The sexes-combined mean 
chronological age was 63.50 months, 


mean Full Scale IQ on the Wechsler 
Intelligence Scale for Children was 
100.51, and mean score on the Index of 
Status Characteristics (11) was 42.29 
(lower middle class). Of the total 
group 11 were only children, 48 had 
older siblings, 38 had younger siblings, 
and 53 had both older and younger 
siblings; none were twins. All children 
were considered normal in intelligence 
(had Full-Scale WISC IQs of 70 or 
above), were currently considered by 
their parents not to stutter, and were 
found by a hearing-screening proce- 
dure with a pure-tone audiometer to 
have normal hearing. Children of uni- 
versity students and rural dwellers were 
not included in the sample. 


The 50 consecutive responses were 
elicited and tape recorded in each 
child’s home through the presentation 
of Children’s Apperception Test cards 
one at a time, the examiner engaging 
the child in conversation with neutral 
comments such as ‘Tell me about this 
picture.’ Only the examiner was present 
with the child. 

Used in the present study were the 
typewritten transcripts of the tape- 
recorded responses and from these 
MLR and SCS were calculated (a) on 
the basis of the first five, first 10, first 







15, first 20, first 25, first 35, and the total 
of 50 responses and (b) for each of 10 
groups of five responses, the first five, 
the second five, ... , through the tenth 
five responses, each group of five responses
being considered a response segment.
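
The two sampling schemes described above, (a) cumulative samples and (b) consecutive five-response segments, can be sketched for MLR as follows. This is only an illustration: the function names are hypothetical, and counting whitespace-separated tokens is a simplifying assumption that stands in for the word-counting rules of Winitz (13), which are not reproduced here. SCS is omitted because it depends on a grammatical classification of each response.

```python
from statistics import mean

# Sample sizes examined in the study (number of leading responses used).
SAMPLE_SIZES = (5, 10, 15, 20, 25, 35, 50)
SEGMENT_LENGTH = 5  # responses per response segment


def count_words(response):
    """Count words in one transcribed response.

    Assumption: whitespace-separated tokens stand in for the study's
    word-counting rules, which are not reproduced here.
    """
    return len(response.split())


def mean_length_of_response(responses):
    """MLR: mean number of words per response."""
    return mean(count_words(r) for r in responses)


def mlr_by_sample_size(responses, sizes=SAMPLE_SIZES):
    """MLR from the first n responses, for each n the sample can supply."""
    return {n: mean_length_of_response(responses[:n])
            for n in sizes if n <= len(responses)}


def mlr_by_segment(responses, segment_length=SEGMENT_LENGTH):
    """MLR for each consecutive, non-overlapping response segment."""
    full = len(responses) - len(responses) % segment_length
    return [mean_length_of_response(responses[i:i + segment_length])
            for i in range(0, full, segment_length)]
```

Applied to a 50-response transcript, `mlr_by_sample_size` gives the seven values that enter one child's row of the cumulative analysis, and `mlr_by_segment` gives that child's 10 segment scores.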


Results 


Analysis of Samples of Increasing 
Size. Table 1 presents the means and 
standard deviations for MLR and SCS 
calculated for speech samples of seven 
different sizes (5, 10, 15, 20, 25, 35, and 
50 responses). Inspection of the means 
indicates that the differences between 
them are small: the range of mean 
MLR scores is from 5.54 words to 5.87 
words, while the range of mean SCS 
values is from 39.45 points to 41.18 
points. On the basis of these data it 
appears that for a group of subjects the 
group mean would remain essentially 
the same regardless of the number of 
responses on which the scores are based. 


Further inspection of Table 1 shows
that the fewer the number of responses
obtained, the greater the variability of

TABLE 2. Summary of analyses of variance performed on mean length of response and structural
complexity scores for 10 five-response segments.

Source                            df         ms

Mean Length of Response
   Response Segments (RS)          9    2397.12
   Subjects (S)                  149    3273.42
   RS x S                       1341     493.42
   Total                        1499

Structural Complexity Score
   Response Segments (RS)          9      13.30
   Subjects (S)                  149      20.84
   RS x S                       1341       6.52
   Total                        1499








the scores. This added variability can 
be accounted for primarily by the ad- 
dition of greater ‘measurement error’ 
as the number of responses used be- 
comes smaller. This finding demon- 
strates that the primary consideration 
in reaching a decision about the number 
of responses to use is the reliability of 
the individual scores obtained from 
varying numbers of responses. 


Analysis of Response Segments. In 
order to study the reliability of these 
language measures, an analysis of vari- 
ance was carried out on the MLR and 
SCS values based on 10 separate re- 
sponse segments in a Response-Seg- 
ments-by-Subjects design (Table 2). 
From the analysis of variance data several
reliability estimates were obtained
by the following formula (3, p. 361):

\[
r_{\bar{X}_k \bar{X}_k} = \frac{ms_S - ms_{RS \times S}}{ms_S + \left(\frac{n}{k} - 1\right) ms_{RS \times S}}
\]

where $n$ = number of response segments used
in sample (that is, $n = 10$)

$k$ = any number of response segments
from zero to infinity

$r_{\bar{X}_k \bar{X}_k}$ = estimated reliability coefficient of
the mean of $k$ response segments

The interaction mean square ($ms_{RS \times S}$)
was considered to be the appropriate
error term in this reliability analysis,
rather than the total within-subjects
variability, estimated by $ms_{RS \times S} + ms_{RS}$.
When scores are based on the same 
response segments for all subjects and 
when stimulus materials (which could 
cause systematic variability between 
response segments) are standardized, 
between-response-segments variability 
is a constant and does not represent 
error. 
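
As a check on the formula, the following Python sketch recomputes reliability estimates from the mean squares in Table 2. The function and variable names are illustrative and are not taken from the original analysis; only the mean squares and n = 10 come from the text.

```python
def reliability_of_mean(ms_subjects, ms_interaction, n, k):
    """Estimated reliability of the mean of k response segments.

    ms_subjects    -- between-subjects mean square (ms_S)
    ms_interaction -- Response-Segments-by-Subjects mean square (ms_RSxS)
    n              -- number of segments in the analyzed sample
    k              -- number of segments whose mean is being evaluated (k > 0)
    """
    numerator = ms_subjects - ms_interaction
    denominator = ms_subjects + (n / k - 1.0) * ms_interaction
    return numerator / denominator


# Mean squares copied from Table 2 (n = 10 five-response segments).
TABLE_2 = {
    "MLR": {"ms_subjects": 3273.42, "ms_interaction": 493.42},
    "SCS": {"ms_subjects": 20.84, "ms_interaction": 6.52},
}


def reliability_row(k, n=10):
    """Reliability estimates for both measures at a given k."""
    return {measure: reliability_of_mean(ms["ms_subjects"],
                                         ms["ms_interaction"], n, k)
            for measure, ms in TABLE_2.items()}


if __name__ == "__main__":
    for k in (1, 5, 10, 20, 50):
        row = reliability_row(k)
        print(f"k={k:>2}  MLR={row['MLR']:.2f}  SCS={row['SCS']:.2f}")
```

With k = n = 10 the denominator reduces to the between-subjects mean square, so the estimate is simply the proportion of that mean square not attributable to the interaction term.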


By varying the value of k in the 
above formula it was possible to esti- 
mate the reliability of the mean of any 










TABLE 3. Estimated reliability coefficients of
means of k response segments (five responses
per segment) for mean length of response (MLR)
and structural complexity score (SCS).

  k     MLR    SCS

  1     .36    .18
  2     .53    .31
  3     .63    .40
  4     .69    .47
  5     .74    .52
  6     .77    .57
  7     .80    .61
  8     .82    .64
  9     .84    .66
 10     .85    .69
 11     .86    .71
 12     .87    .72
 13     .88    .74
 14     .89    .75
 15     .89    .77
 16     .90    .78
 17     .91    .79
 18     .91    .80
 19     .91    .81
 20     .92    .81
 25     .93    .85
 30     .94    .87
 35     .95    .88
 40     .96    .90
 45     .96    .91
 50     .97    .92








particular number of response seg- 
ments. These estimated reliability co- 
efficients appear in Table 3 for the two 
language measures studied and are 
plotted, together with their 98% con- 
fidence limits, as a function of the num- 
ber of response segments in Figure 1. 


Discussion 


The results of the reliability analysis 
indicate that the decision as to how 
many responses to elicit in order to 
obtain MLR and SCS values depends 
on the precision needed in a particular 
situation. For 50 responses (10 response 




segments), the number most commonly 
used in previous research, the reliability 
of MLR scores (.85) seems adequate 
for most purposes; however, the reli- 
ability of SCS values derived from this 
number of responses (.69) may repre- 
sent less precision than is desired in 
some situations. 

Increasing the number of responses 
taken would improve the reliability of 
the scores. For MLR, however, the 
curve in Figure 1 begins to plateau 





[Figure 1: two panels, MLR and SCS. Vertical axis: Reliability Coefficient; horizontal axis: Number of Five-Response Segments. Each panel shows the obtained reliability coefficient and its 98% confidence limits.]

FIGURE 1. Estimated reliability coefficients for
mean length of response (MLR) and structural
complexity score (SCS), together with
their 98% confidence limits, as a function of
number of five-response segments.










soon after 50 responses. Thus, a fairly 
large increase in number of responses 
would be required to improve reliabil- 
ity appreciably. For SCS the use of a 
few more responses would bring about 
a sizable change in measurement pre- 
cision. 

For some research it may be neces- 
sary to elicit fewer than 50 responses. 
A necessary consequence of this re- 
striction of sample size is loss of pre- 
cision resulting in an increase in the 
error term of any statistical test used. 
The loss of precision can be overcome, 
to some extent, by the use of more 
subjects; however, obtaining and test- 
ing a larger number of subjects would 
seem to be a less efficient procedure, in 
terms of the time and effort required, 
than eliciting more responses from a 
smaller number of subjects. 


The curves in Figure 1 indicate that 
SCS values are less reliable than MLR 
scores when both are based on the 
same number of responses, possibly be- 
cause the structural complexity of a 
child’s speech is a more variable phe- 
nomenon from response to response 
than is length of response. It is also pos- 
sible that the arbitrary weights assigned 
in the Templin scheme to the different 
grammatical categories may not be en- 
tirely appropriate. 

In general, it appears that eliciting 50 
responses is adequate for obtaining 
MLR scores but that more than 50 
responses should probably be obtained 
if SCS values are desired. Interpreta- 
tion of the results of this study must 
be made with the realization that lan- 
guage samples of children of only one 
age were used. The fact that the stand- 
ard deviations reported by Templin 
(10, pp. 79, 82) for MLR and SCS are 


fairly similar at all age levels from three 
to eight years suggests that the reli- 
ability of these language measures 
probably does not vary greatly with 
age within the age range studied by 
Templin. 


Summary 


Speech samples collected from 150 
five-year-old kindergarten children 
were used to evaluate the reliabilities 
of two language measures, mean length 
of response (MLR) and structural
complexity score (SCS), in relation to
the size of the language sample. Each 
sample was divided into 10 five-re- 
sponse segments and MLR and SCS 
were calculated for each segment. Re- 
liability estimates were obtained for 
the two language measures for varying 
numbers of response segments. 

The reliability analysis suggests that 
MLR scores based on 50 responses are 
of adequate reliability for most research 
purposes; however, the reliability of 
SCS values based on 50 responses may 
represent less precision than is desired 
in some situations. From the data pre- 
sented the number of response seg- 
ments necessary to achieve given levels 
of reliability can be estimated. 
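
One way to make such an estimate is to solve the reliability formula for k. The sketch below does this algebraically; it is an illustration under the same model, not a procedure described by the authors, and the function name is hypothetical. The mean squares are those of Table 2.

```python
import math


def segments_needed(ms_subjects, ms_interaction, n, target):
    """Number of segments k at which the estimated reliability equals target.

    Obtained by solving
        target = (ms_S - ms_RSxS) / (ms_S + (n / k - 1) * ms_RSxS)
    for k. Returns a float; round up to get a whole number of segments.
    """
    if not 0.0 < target < 1.0:
        raise ValueError("target reliability must lie strictly between 0 and 1")
    difference = ms_subjects - ms_interaction
    return n * ms_interaction * target / (difference * (1.0 - target))


if __name__ == "__main__":
    # How many five-response segments would SCS need to reach .85?
    k = segments_needed(20.84, 6.52, n=10, target=0.85)
    print(math.ceil(k), "segments, i.e.", 5 * math.ceil(k), "responses")
```

Under these assumptions SCS would need about 26 five-response segments to match the reliability that MLR reaches with roughly 10.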


Acknowledgments 


The authors express their apprecia- 
tion to Dr. Harris Winitz for the use 
of his language sample protocols, to 
Miss Sumiko Sasanuma for assistance 
in tabulation, and to Dr. Arnold M. 
Small for computer programming. 


References 


1. Davis, Edith A., The development of linguistic skill in twins, singletons with siblings, and only children from age five to ten years. Inst. Child Welf., Monogr. Ser., No. 14. Minneapolis: Univ. Minn. Press, 1937.

2. Day, Ella J., The development of language in twins: I. A comparison of twins and single children. Child Develpm., 3, 1932, 179-199.

3. Lindquist, E. F., Design and Analysis of Experiments in Psychology and Education. Boston: Houghton Mifflin, 1953.

4. McCarthy, Dorothea A., Language development in children. Chap. 9 in L. Carmichael (Ed.), Manual of Child Psychology. (2nd ed.) New York: Wiley, 1954.

5. McCarthy, Dorothea A., The language development of the preschool child. Inst. Child Welf., Monogr. Ser., No. 4. Minneapolis: Univ. Minn. Press, 1930.

6. Nice, Margaret M., Length of sentences as a criterion of a child’s progress in speech. J. educ. Psychol., 16, 1925, 370-379.

7. Schneiderman, Norma, A study of the relationship between articulatory ability and language ability. J. Speech Hearing Dis., 20, 1955, 359-364.

8. Smith, Madorah E., An investigation of the development of the sentence and the extent of vocabulary in young children. Univ. Ia. Stud. Child Welf., 3, No. 5, 1926.

9. Spriestersbach, D. C., Darley, F. L., and Morris, H. L., Language skills in children with cleft palates. J. Speech Hearing Res., 1, 1958, 279-285.

10. Templin, Mildred C., Certain language skills in children, their development and interrelationships. Inst. Child Welf., Monogr. Ser., No. 26. Minneapolis: Univ. Minn. Press, 1957.

11. Warner, W. L., Meeker, Marchia, and Eells, K., Social Class in America (a manual of procedure for the measurement of social status). Chicago: Science Research Assoc., 1949.

12. Williams, H. M., An analytical study of language achievement in preschool children. Univ. Ia. Stud. Child Welf., 13, No. 2, Part 1, 1937.

13. Winitz, H., Language skills of male and female kindergarten children. J. Speech Hearing Res., 2, 1959, 377-386.





Articulatory Competency and 


Reading Readiness 


CARL H. WEAVER 


CATHERINE FURBEE 


RODNEY W. EVERHART 


It has been fairly well established that 
speech and reading are closely asso- 
ciated in the linguistic process involving 
symbolic formulation, evaluation, and 
expression. Artley (1) conceptualized 
speaking and reading as two sides of 
a square which represents language. 
The other two sides are writing and 
listening. He considered the four sides 
to be interdependent and after review- 
ing the literature concluded that there 
is a relationship between speech diffi- 
culties and deficiencies in reading, 
though there is no agreement as to the 
extent of the relationship. 

Research in the relation between 
reading achievement and defective 
speech began at least as early as 1931 
when Travis quoted Murray’s thesis 
to the effect that stuttering seemed 
to affect both silent and oral reading 





Carl H. Weaver (Ph.D., Ohio State Uni- 
versity, 1957) is Assistant Professor of 
Speech, Central Michigan University. Cath- 
erine Furbee (M.Ed., Pennsylvania State Uni- 
versity, 1943) is Director of Speech Cor- 
rection, Public Schools, Saginaw, Michigan. 
Rodney W. Everhart (Ph.D., University of 
Michigan, 1953) is Associate Professor of 
Speech and Director, Children’s Speech 
Clinic, Central Michigan University. 


Volume 3, No. 2 174 


(18, p. 166). In general, the research
has been conducted in two ways: (a)
groups of good and poor readers were
given speech tests and the incidence of
speech defects in each group compared
(13, p. 92; 2; 11; 4); and (b) groups
of children with and without speech 
defects were given reading tests and 
their scores compared (15, 14). When 
the former design was used, some significant
differences in numbers of speech
defects were found between groups of
good readers and groups of poor readers.
Monroe (13, p. 92) found that
27% of 415 children whose reading 
was defective also had speech defects, 
whereas only 8% of a control group 
had speech defects. Jackson (11) found 
that 23% of his 300 poor readers had 
defective speech compared to only 
10% of his 300 good readers. Bennett 
(2), using 50 pairs of children (a good 
reader paired with a poor reader) 
found that 9.5% of his poor readers 
had histories of stuttering which could 
be remembered by the children or 
their parents, while only 3% of his 
good readers had such histories. Using 
the second design, Moss (15) matched 


June 1960 










Weaver, Furbee, Everhart: Reading Readiness 175 


36 pairs of children, one child in each 
pair having normal speech and one 
defective speech. He administered the 
Gray Standardized Oral Reading Test 
and found significant differences in the 
rate of oral reading and the number 
of errors. Moore (14) found that 236 
ninth-grade children with speech de- 
fects were not, as a whole, deficient 
in reading ability, as measured by the 
Iowa Silent Reading Test. After reviewing
these and other reports,
Artley (1) concluded that:

Speech defects may be the cause or result
of reading defects, or they may exist side
by side as a result of some common factor.
There is, however, some concomitant relationship.

When speech defects are causal, the 
relationship may occur in these ways: 
(a) Bad speech habits may generalize 
to silent reading (6, p. 99); (b) the 
reader may concentrate on his concern 
about his speech and neglect the mean- 
ing; (c) speech defects may upset the 
rate and phrasing; (d) speech defects 
may result in errors of pronunciation 
and consequent misunderstanding of 
words (10, 1); (e) speech defects may 
cause dislike for reading and result in 
less practice than normal speakers get 
(10). 

There is some evidence that the dem- 
onstrated relationship between speech 
defects and inferior oral reading may 
extend to silent reading, but the evi- 
dence is inconclusive. 


All of this research has been con- 
cerned with the relation between 
speech defects and reading achieve- 
ment. Subjects used ranged from the 
second grade to the college level (8). 
All, however, had learned to read in 
some measure and the variable studied 
(dependent or independent) was read- 


ing ability. It is possible, however, that 
a common factor, possibly phonetic or 
semantic in nature, underlies both 
speech and reading. Hildreth (10) 
wrote, ‘Normally, a child’s first read- 
ing experiences are oral, and even in 
silent reading the persistence of inner 
speech suggests the close connection 
between reading and oral language.’ In 
search of this common factor, research 
workers have investigated intelligence, 
auditory acuity, auditory memory 
span, and speech sound discrimination 
(8, 12, 17, 13, 9). In general, results are 
inconclusive. Except for gross inade- 
quacies in these areas, none seems to 
have any important relationship to 
speech defects. 

Artley (1) and Hildreth (10) at- 
tached some importance to the 
readiness level of the child. Artley 
described as one aspect of the prob- 
lem ‘limited background for speech 
growing out of an absence of mean- 
ingful experiences with relation to both 
speech and reading improvement.’ It 
seems possible that readiness for both 
speech and reading may be related to 
this kind of experiential accumulation.
The discovery of any such relationship 
would be of both theoretical and 
practical importance. 

In 1957, Betts (3, p. 316) proposed 
a theory of the sequential develop- 
ment of language which tends to 
parallel the hypothesis offered by Hil- 
dreth (10). In essence, Betts divided 
the sequence of language maturation 
in the child into six major areas. These 
encompass the development of visual 
perception relevant to objects, hear- 
ing comprehension involving discrim- 
ination between the speech sounds 
heard by the child, the production of 







meaningful speech configurations, read- 
ing based upon written or printed 
symbols, writing, and the general re- 
finement of language control. Accord- 
ing to this theory, facility in oral 
language could reflect directly upon, 
or be reflected by, a readiness for read- 
ing, although Betts did not suggest it. 
It would appear that some competency 
in the symbolizing and phonetic abili- 
ties underlies both capacity for speech 
and capacity for reading; and that the 
accumulated evidence that normal- 
speaking children are better readers 
than children with defective speech 
points to capacities which are com- 
mon to both speech and reading. 

The present investigation thus was 
designed to study the relationship be- 
tween articulatory competency and 
the reading readiness of children as the 
latter was measured by the Gates 
Reading Readiness Test. It was hypo- 
thesized that the skills and capacities 
measured by this test are related to the 
early acquisition of adequate speech. 


Procedure 


Speech tests and the Gates Reading 
Readiness Test were administered to 
638 first grade children enrolled in the 
Saginaw, Michigan, public schools. One 
section of grade one was selected for 
study in each school currently operat- 
ing as a regular component of the 
school system. A responsible admin- 
istrative official in consultation with 


the speech correction coordinator determined
which sections best represented
a typical sample of the area from
which pupils were drawn. Care was 
exercised in choosing sections that were 
representative with regard to the var- 


iates of chronological age, intelligence, 
race, and socioeconomic level. Pupils 
possessing articulatory defects stem- 
ming from organic etiologies, stutterers, 
and mental defectives were excluded 
from the sample. 

A clinical evaluation of articulatory 
competency in both directed and spon- 
taneous speech was made by the speech 
correction coordinator for each child 
participating in this study. The evalu- 
ation involved (a) asking the child 
to label a series of pictures designed 
to elicit the different consonant sounds 
in the initial, medial, and final posi- 
tions; and (b) observing the pupil’s 
connected and continuous speech dur- 
ing a brief conversation. Articulatory 
deviations ordinarily present in the 
speech patterns of normal speaking in- 
dividuals and those attributable to 
dialect were not considered. Only dis- 
orders of articulation that could be 
classified as substitutions, omissions, 
distortions, or additions were recorded. 


An error of position also was counted


as one misarticulation. Thus it was 
possible for a child to exhibit 68 errors 
in the production of the so-called 
‘pure’ sounds. In addition, 30 blends 
were tested, making possible an artic- 
ulation error score of 98. A child 
whose speech demonstrated none of 
these errors was considered to possess 
‘normal’ speech. 

The reading readiness test was ad- 
ministered by the regular classroom 
teacher. Both the speech and the read- 
ing readiness tests were administered 
during the first four weeks of school. 
Percentile scores were obtained from 
the Gates tables for the subtests (Word 
Matching, Picture Directions, Word- 
Card Matching, Rhyming, and Letters 







TABLE 1. Mean reading readiness (RR) scores
for groups of children whose speech demonstrated
from 0 to 10 errors in articulation.

Articulation    Mean RR      N
Errors          Scores

     0            72.7     163
     1            67.9      65
     2            67.9      57
     3            60.5      38
     4            61.5      31
     5            49.9      26
     6            53.0      17
     7            51.5      20
     8            55.3      19
     9            54.8      17
    10            54.0      18








and Numbers) of the Gates Reading 
Readiness Test, and from these per- 
centile scores average percentiles were 
calculated and used as the total reading 
readiness scores. 


Results 


Of the 638 children, 163 were judg- 
ed to have normal speech. On the 
speech test, the remaining 475 exhibited 
from 1 to 78 out of a possible 98 artic- 
ulation errors. The mean number of 
errors was 10.7. The data for subjects 
whose speech was judged to demon- 
strate 0 to 10 errors and the mean 


reading readiness score for these groups 
may be seen in Table 1. A drop of 
about five percentiles in reading readi- 
ness can be seen for the group with 
one articulation error. Another sharp 
drop can be seen for the group with 
three speech errors. The almost steady 
decrease in reading readiness as the 
number of speech defects increased is 
apparent, with variations which can 
probably be attributed to the smaller 
numbers of children in each group as 
the number of errors increases. 

The data were arranged in two 
groups: for normal-speaking children 
and for children with one or more 
errors (Table 2, data for each group 
being presented separately by sex). The 
difference in reading readiness be- 
tween the normal and the nonnormal 
groups was evaluated by means of the 
median sign test (16, p. 314). The value 
of chi square (52.42, 1 df) is signifi- 
cant well beyond the .1% level. Both 
the high level of confidence and the 
size of the difference provide evidence 
to support the original hypothesis that 
articulatory skill and the variables 
measured by the reading readiness test 
are related. 
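
The totals in Table 2 can be verified as the n-weighted means of the male and female subgroup means. The short Python check below is only an arithmetic illustration; the function name is hypothetical and the figures are those printed in Table 2.

```python
def pooled_mean(groups):
    """Return the n-weighted mean of (n, mean) pairs."""
    total_n = sum(n for n, _ in groups)
    return sum(n * m for n, m in groups) / total_n


# (N, mean percentile) for males and females, from Table 2.
NORMAL = [(77, 71.14), (86, 74.14)]
NONNORMAL = [(257, 57.14), (218, 57.16)]

normal_mean = pooled_mean(NORMAL)
nonnormal_mean = pooled_mean(NONNORMAL)
difference = normal_mean - nonnormal_mean

if __name__ == "__main__":
    print(f"normal {normal_mean:.2f}, nonnormal {nonnormal_mean:.2f}, "
          f"difference {difference:.2f}")
```

The weighted means agree with the Total row of the table to two decimals.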

An examination of the data indicates 
that the mean reading readiness scores 
tended to decrease as articulation er- 
rors increased (Table 1); the skills and 


TABLE 2. Mean reading readiness scores (in percentiles) and group differences for 163 normal-speaking
and 475 nonnormal-speaking children.

                Normal            Nonnormal
Group          N     Mean        N     Mean      Difference

Male          77    71.14      257    57.14        14.00
Female        86    74.14      218    57.16        16.98
Total        163    72.72      475    57.15        15.57










capacities measured by the Gates test 
thus appear to be inversely related to 
the number of articulatory errors. Accordingly,
the product-moment correlation
coefficient¹ for estimating the
strength of the relationship between 
the articulation score and the reading 
readiness score was computed. The co- 
efficient was -.20, significant at the 
1% level. Coefficients computed for 
boys and girls separately were -.11 and 
-.10, both of which were just short of 
significance at the 5% level. The dif- 
ferences between the total group 
correlation and the within group cor- 
relations reflect the pattern which ex- 
ists among the means for the sex groups 
on the two variables. Thus much of 
the total group correlation may be 
attributed to the mean differences be- 
tween the boys and girls. Since the 
square of the coefficient for the total 
sample indicates only 4% common 
variance for the two sets of measures, 
the relationship is not strong. It seem- 
ed likely that the variables assessed 


by the Gates test might not all be re- 


lated to the early acquisition of speech 
and that other unmeasured capacities 
might facilitate early acquisition of 
speech. Gates (7) wrote that ‘the ca- 
pacity for listening to the first part of 
a story and then finishing it is more 
closely related to reading readiness than 
the capacities measured in the test.’ It
was not possible to assess this factor
with the present data, but an effort was
made to determine whether all of the 
subtests were significantly related to 
adequate speech. 


Product-moment correlation coeffi- 


¹Regression was assumed to be linear; η² =
.07; F = .72 with 18 and 455 df (4).


cients for estimating the relationship 
between the articulation scores and the 
percentile scores on the subtests and 
total test were as follows: Picture Di- 
rections, —.23; Word Matching, —.10; 
Word-Card Matching, —.15; Rhyming, 
—.15; Letter-Number Naming, —.17; 
and total test, —.20. All coefficients were 
significant at the 1% level except that 
for the second subtest, Word Match- 
ing, which was significant at the 5% 
level. All were of about the same 
magnitude as the coefficient for the 
total test score. At the 5% level, the 
confidence limits of any pair of coeffi- 
cients overlap. Thus it must be con- 
cluded that there is no evidence in these 
data that any one of the subtests is 
more closely related than the others 
to the capacity for learning speech. 
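The statement that the confidence limits of any pair of coefficients overlap can be checked approximately. The sketch below uses the Fisher z transformation to form 95% intervals; the article does not say how its limits were computed, so the method is an assumption, as is the sample size of 475 (taken from the two age groups reported later).

```python
import itertools
import math

N = 475  # assumed sample size (231 + 244); not stated for this analysis

# Coefficients as reported in the text.
SUBTEST_R = {
    "Picture Directions": -0.23,
    "Word Matching": -0.10,
    "Word-Card Matching": -0.15,
    "Rhyming": -0.15,
    "Letter-Number Naming": -0.17,
    "Total test": -0.20,
}

def fisher_interval(r, n, z_crit=1.96):
    """Approximate 95% confidence limits for r via the Fisher z transformation."""
    z = math.atanh(r)
    half_width = z_crit / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)

def overlaps(a, b):
    """True when two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

intervals = {name: fisher_interval(r, N) for name, r in SUBTEST_R.items()}
all_overlap = all(
    overlaps(intervals[a], intervals[b])
    for a, b in itertools.combinations(intervals, 2)
)

for name, (low, high) in intervals.items():
    print(f"{name:22s} {low:+.3f} to {high:+.3f}")
print("every pair overlaps:", all_overlap)
```

Even the two most widely separated coefficients, -.23 and -.10, have overlapping intervals under these assumptions, in line with the conclusion drawn in the text.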

Finally, the effect of age upon this
relationship between the reading readi- 
ness measures and the articulation meas- 
ures was assessed. The sample was 
divided into a group of 231 subjects 
under six years and three months old 
and a group of 244 subjects at least six 
years and three months old. The mean 
reading readiness score for the older 
group was 60.24 and for the younger 
group 53.90. This difference was eval- 
uated by means of the median sign 
test. The value for chi square, correct- 
ed for continuity by the Yates form- 
ula, was 1.23, significant at about the 
25% level. 

The strength of the relationship between
articulation and reading readiness was compared for
the two age groups by a test of the difference between
the product-moment correlations. The coefficients were
not significantly different from each other, -.18
± .03 for the younger group and -.20













Weaver, Furbee, Everhart: Reading Readiness 179 


± .03 for the older group. These were
interpreted to mean that in these data 
there was no evidence that age differ- 
ential had any important effect upon 
the relationship between speech ade- 
quacy and reading readiness. 
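A test of the difference between two independent correlations can be sketched with the Fisher z transformation. The article does not name the procedure it used, so this choice is an assumption; the coefficients and group sizes are those given in the text, and the p value relies on the normal distribution.

```python
import math

def compare_correlations(r1, n1, r2, n2):
    """Fisher z test for the difference between two independent correlations."""
    z1, z2 = math.atanh(r1), math.atanh(r2)
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / standard_error
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-tailed, normal distribution
    return z, p

# Younger group: r = -.18, n = 231; older group: r = -.20, n = 244.
z, p = compare_correlations(-0.18, 231, -0.20, 244)
print(f"z = {z:.2f}, two-tailed p = {p:.2f}")
```

The resulting z is far below any conventional critical value, which agrees with the report that the two coefficients did not differ significantly.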


Discussion 


It was not possible from these data 
to conclude that the presence of an 
underlying variable causal to the ac- 
quisition of both adequate speech and 
reading had been either confirmed or 
denied. The large percentile differences
in reading readiness between
children with normal and children
with nonnormal speech confirmed the
hypothesis that whatever is measured 
by the Gates test is related to early 
speech adequacy. Yet the low correla- 
tion coefficient suggested that the 
strength of the relationship was not 
great. The size of the correlations re- 
mained about the same for the total 
test score, the subtest scores, different 
age groups, and sex groups. All of the 
coefficients were significant but low, 
except two, which approached signifi- 
cance. 

It is possible that the reading readiness
test was tapping a part of an underlying
capacity which, if better
measured, would account for most of 
the covariance between articulatory 
skill and reading readiness. It seems 
more likely that the causes of early 
speech adequacy and reading readi- 
ness are multiple, and perhaps quite 
differential among subjects. 

Only a tentative explanation can be 
offered for the fact that the mean 
percentile on the Gates Reading Readi- 
ness Test was 61.13 for this sample of 


638 first grade children. It is possible 
that the population used in this study 
was significantly different from the 
population used by Gates for the stand- 
ardization of the test. The effect of 
this relatively high mean percentile in 
reading readiness upon the present 
treatment of the data is not known. 


Summary 


A speech articulation test and the 
Gates Reading Readiness Test were 
administered to 638 children in grade 
one during the first four weeks of 
school. On the basis of the data ob- 
tained, the following statements seem 
to be justified: 


Reading readiness and acquisition of 
adequate speech are to some extent re- 
lated, although the proportion of vari- 
ance common to reading readiness 
measures and articulation measures is 
quite small. It is possible that the Gates 
Reading Readiness Test measures part 
of an underlying variable causal to the 
acquisition of both reading and speech. 

The strength of the relationship be- 
tween reading readiness and articula- 
tory skill in grade one seems to be 
about the same for each of the Gates 
subtests and for two age groups, one 
younger than six years and three 
months and the other older. 


References 


1. Artley, A. S., A study of certain factors presumed to be associated
   with reading and speech difficulties. J. Speech Hearing Dis., 13,
   1948, 351-360.

2. Bennett, C. C., An Inquiry into the Genesis of Poor Reading. No. 755,
   Teacher’s College Contributions to Education. New York: Columbia
   Univ., 1938.

3. Betts, E. A., Foundations of Reading Instruction with Emphasis on
   Differentiated Guidance. New York: American Book, 1957.

4. Binder, A., Considerations of the place of assumptions in
   correlational analysis. Amer. Psychol., 14, 1959, 504-510.

5. Bond, G. L., The Auditory and Speech Characteristics of Poor Readers.
   No. 657, Teacher’s College Contributions to Education. New York:
   Columbia Univ., 1935.

6. Gates, A. I., The Improvement of Reading; A Program of Diagnostic and
   Remedial Methods. (3rd ed.) New York: Macmillan, 1947.

7. Gates, A. I., Manual of Directions for Gates Reading Readiness Tests.
   New York: Columbia Univ., 1942.

8. Hall, Margaret E., Auditory factors in functional articulatory speech
   defects. J. exp. Educ., 7, 1938, 110-132.

9. Henry, Sibyl, Children’s audiograms in relation to reading
   attainments: III. Discussion, summary, and conclusions. J. genet.
   Psychol., 71, 1947, 49-63.

10. Hildreth, Gertrude, Interrelationships among the language arts. Elem.
    School J., 48, 1948, 538-549.

11. Jackson, J., A survey of psychological, social, and environmental
    differences between advanced and retarded readers. J. genet.
    Psychol., 65, 1944, 113-131.

12. Mase, D. J., Etiology of Articulatory Speech Defects. No. 921,
    Teacher’s College Contributions to Education. New York: Columbia
    Univ., 1946.

13. Monroe, M., Children Who Cannot Read; The Analysis of Reading
    Disabilities and the Use of Diagnostic Tests in the Instruction of
    Retarded Readers. Chicago: Univ. Chicago Press, 1932.

14. Moore, C. E. A., Reading and arithmetic abilities associated with
    speech defects. J. Speech Dis., 12, 1947, 85-86.

15. Moss, Margery Anne, The effect of speech defects on second grade
    reading achievement. Quart. J. Speech, 24, 1938, 642-654.

16. Mosteller, F., and Bush, R. R., Selected quantitative techniques. In
    Gardner Lindzey, Handbook of Social Psychology. Cambridge:
    Addison-Wesley, 1954.

17. Robinson, Helen, Why Pupils Fail in Reading. Chicago: Univ. Chicago
    Press, 1946.

18. Travis, L. E., Speech Pathology; A Dynamic Neurological Treatment of
    Normal Speech and Speech Deviations. New York: D. Appleton, 1931.








Extensional Definition and Attitude toward Stuttering 


LONNIE L. EMERICK 


Stutterers report that the amount of
stuttering they do varies from situation
to situation (2). The listener’s perception
of stuttering also tends to vary
from situation to situation (3, 17). In
other words, the problem of stuttering, 
especially at the time of onset, repre- 
sents a kind of relationship between 
speaker and listener rather than a con- 
stant condition of either speaker or 
listener (7, chap. 28). 

Two empirical studies relevant here 
are those by Boehmler (3) and Tuthill 
(17). It was demonstrated in both of 
these that persons acquainted with stut- 
tering counted significantly more stut- 
terings than did lay judges in reacting 
to the same speech samples. Compar- 
able behavior on the part of listeners 
was noted in a nasality study by Webb 
(18) who found that trained and ex- 
perienced speech pathologists gave sig- 
nificantly more severe ratings than did 
naive judges. 

A generalization that might be drawn 
from the available data is that incre- 
ments of training in speech pathology 
appear to be associated with a ten- 
dency to judge abnormal speech more 
severely. With specific reference to 
stuttering, judges trained in speech 





Lonnie L. Emerick (M.A., Michigan State 
University, 1959) is Graduate Fellow in 
Speech, Michigan State University. This ar- 
ticle is based on an M.A. thesis completed 
under the direction of Dr. Ralph R. Leuten- 
egger. 


Volume 3, No. 2 


181 


pathology tend to count more moments 
of stuttering than do lay judges. This 
observed difference in extensional defi- 
nition of stuttering is probably due to 
the reaction sensitivity of speech path- 
ologists regarding speech breakdowns, 
a reaction which has been described 
(4, p. 66) as ‘a selective readiness to 
respond to certain components of a 
situation and not others, which is a 
result of one’s having acquired a sys- 
tem of related attitudes and responses.’ 

Speech pathology training also seems
to be associated with a more tolerant
attitude toward stuttering. Ammons
and Johnson (1) found, in the development
of the Iowa Scale of Attitude Toward
Stuttering, that:


The mean scores were found to differ- 
entiate the groups, with clinicians show- 
ing the least unfavorable reaction to 
stuttering, freshmen and stutterers a mod- 
erate reaction, and townspeople the most 
unfavorable reaction. 

Although speech pathologists count 
more stutterings in a given sample of 
speech than do lay judges, speech path- 
ologists have better, that is, more tol- 
erant, attitudes toward stuttering than 
do lay individuals. It would appear, 
therefore, that attitude toward stutter- 
ing is related to extensional definition 
(countings) of stuttering and, further, 
that a group of judges with ‘good’ at- 
titudes toward stuttering will count 
more stutterings in a given passage of 
speech than will another group of 


June 1960 







judges with ‘poor’ attitudes toward 
stuttering. 


An individual’s perception of objects and events seems
related to his attitude toward the objects and events.
Research (6, 10) concerned with perceptual defense has
demonstrated the phenomenon of perceptual filtering,
that is, the differential recognition rate of pleasant,
neutral, and unpleasant stimuli; unpleasant stimuli are
not recognized as rapidly as neutral or pleasant
stimuli. Listeners not trained in speech pathology tend
to react to stuttering as a noxious stimulus; there is a
depression of listener behavior, manifested in reduced
bodily activity and conversation (13).


It has been demonstrated that nega- 
tive attitudes toward ethnic and racial 
groups can be significantly altered by 
acquired knowledge concerning the 
groups so evaluated (11, pp. 20-22, 12, 
15, 16). Similarly, it would appear, 
when an individual receives informa- 
tion concerning stuttering, a change in 
the positive direction would be ex- 
pected in the individual’s attitude to- 
ward stuttering. Speech pathology 
training would tend not only to sen- 
sitize one to certain abnormal speech 
phenomena, but also to delimit whatever 
negative evaluations an individual might 
have had prior to such training. Such 
a relationship between extensional defi- 
nition of stuttering and attitude toward 
stuttering is supported in the Darley 
(5, chap. 4) study of parental attitudes. 
The mothers in Darley’s experimental 
group (mothers of stutterers) rated 
their children’s speech more severely 
than did the fathers; on the attitude 
test, however, the fathers of the stut- 
tering children, as a group, appeared 


to have a less tolerant attitude toward 
stuttering than did the mothers. 


The specific null hypothesis tested in 
this study is that there is no relation- 
ship between the variables of number 
of counted stutterings (extensional 
definition) and a selected measure of 
attitude toward stuttering. In other 
words, the main question is concerned 
with whether a person’s attitude to- 
ward stuttering may influence his count 
of the number of stutterings. A minor 
purpose is the evaluation of the pos- 
sible effect upon attitude and upon 
countings of stuttering of such vari- 
ables as marital status, parental status, 
age of subjects’ children, grade level 
taught, and career plans. 


Procedure 


Subjects were 21 male and 127 female elementary school
teachers (grades one through six). Elementary school
teachers were used because they are exposed to a wide
range of children’s speech behavior and also because a
large majority consider themselves inadequately trained
to deal with the problem of stuttering (8, 9).

The Iowa Scale of Attitude Toward Stuttering (1) was
selected to measure the subjects’ attitudes toward
stuttering. The author has certain reservations
concerning the precision of this scale. The scale
appears to measure a subject’s knowledge about
stuttering rather than his attitude toward it.


A 3.5-minute tape recording of the 
oral reading of an individual who con- 
siders himself a stutterer served as the 
stimulus material for all subjects for 
the counting of stutterings. The order 
of presentation of the tape recording 
and the attitude scale was varied, with 










Emerick: Extensional Definition, Attitude on Stuttering 183 


TABLE 1. Distribution of obtained and expected (in parentheses)
frequencies of measures representing extensional definition of
stuttering for all subjects according to attitude category and
results of the chi-square test.

Range of      Frequency       Frequency       Total
Measures      above Median    below Median

1.00-1.49     41 (25)          9 (25)          50
1.50-1.99     24 (27.5)       31 (27.5)        55
2.00-2.49      9 (21.5)       34 (21.5)        43

Total         74              74              148

Chi square (df = 2) = 35.88; value of 5.99 (df = 2) significant
at 5% level








68 subjects reacting to the tape record- 
ing first, the remaining 80 subjects re- 
acting to the attitude scale first. The 
median test (14) was employed as the 
statistical test and the 5% level re- 
quired for significance. 


Results 


The range of the subjects’ countings 
of stutterings extended from a low of 
eight to a high of 87 with a median of 


TABLE 2. Distribution of obtained and expected (in parentheses)
frequencies of measures representing extensional definition of
stuttering by naive and sophisticate groups of subjects and
results of the chi-square test.

Subject        Frequency       Frequency       Total
Group          above Median    below Median

Sophisticate   33 (23.5)       14 (23.5)        47
Naive          41 (50.5)       60 (50.5)       101

Total          74              74              148

Chi square (df = 1) = 10.30; value of 3.84 (df = 1) significant
at the 5% level








30. According to their counts of stut- 
terings, the subjects were divided into 
two groups: those subjects above the 
median score for the combined groups 
and those subjects below the median. 
Those subjects whose counts of stut- 
terings were at the median were as- 
signed randomly and equally above and 
below the median. 


The range of the subjects’ attitude 
scores extended from a low of 1.04 to 
a high of 2.47. The attitude scale re- 
sponses were grouped into three cate- 
gories: 1.00 to 1.49 representing ‘good’ 
attitudes; 1.50 to 1.99 representing 
‘moderate’ attitudes; 2.00 to 2.49 rep- 
resenting ‘poor’ attitudes. 


Order of Presentation. The subjects 
were divided into two groups accord- 
ing to the order in which they reacted 
to the experimental materials. Statistical 
analysis of the countings of stutterings 
failed to yield a significant difference 
between the two groups which counted 
stutterings for the first condition and 
for the second condition, respectively 
(chi square = .84, df = 1). 


Countings of Stuttering and Attitude 
toward Stuttering. The data relating to 
countings of stuttering are summarized 
in Table 1. The obtained chi-square 
value (35.88 with 2 df) is significant, 
which, along with an examination of 
the data reported in Table 2, indicates 
a relationship between attitude and 
countings of stuttering. Statistical anal- 
ysis of the scores on attitudes toward 
stuttering also failed to provide evi- 
dence of a significant difference be- 
tween the two groups (chi square = 
1.21, df = 2). Information reported in 
Table 1 indicates that the lack of in- 
dependence between attitude and 
countings of stuttering is because of 







TABLE 3. Distribution of obtained and expected (in parentheses)
frequencies of measures representing attitude toward stuttering
by naive and sophisticate groups of subjects and results of the
chi-square test.

Range of      Naive        Sophisticate    Total
Measures      Group        Group

1.00-1.49     20 (34)      30 (16)          50
1.50-1.99     47 (38)       8 (17)          55
2.00-2.49     34 (29)       9 (14)          43

Total         101          47              148

Chi square (df = 2) = 27.28; value of 5.99 (df = 2) significant
at 5% level








the trend for higher counts to accom- 
pany ‘better’ attitudes. 
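The chi-square value for the attitude categories can be retraced from the Table 1 frequencies. In this sketch each expected frequency is half the row total, which holds here only because the median split places 74 subjects above and 74 below; the function name and data layout are illustrative assumptions, not part of the original analysis.

```python
# Obtained frequencies from Table 1: (above median, below median) per attitude range.
TABLE_1 = {
    "1.00-1.49": (41, 9),
    "1.50-1.99": (24, 31),
    "2.00-2.49": (9, 34),
}

def median_test_chi_square(rows):
    """Chi square for a median test with an even split of subjects above and below."""
    chi_square = 0.0
    for above, below in rows.values():
        expected = (above + below) / 2.0  # valid because the column totals are equal
        chi_square += (above - expected) ** 2 / expected
        chi_square += (below - expected) ** 2 / expected
    return chi_square

chi_square = median_test_chi_square(TABLE_1)
print(f"chi square (df = 2) = {chi_square:.2f}")
```

The computed value is about 35.9, within rounding of the reported 35.88 and well beyond the critical value of 5.99. Most of it comes from the first and third rows, where 'good' attitudes go with counts above the median and 'poor' attitudes with counts below it.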


Naive and Sophisticate. Of the total 
number of 148 subjects, 47 reported 
that they had taken one or more formal
college courses in speech pathology;
these subjects were designated ‘sophis- 
ticate.’ The remaining 101 subjects re- 
ported that they had not taken formal 
college course work in speech pathol- 
ogy; these subjects were designated 
‘naive.’ A comparison between these 
two groups with reference to the cat- 
egorization above and below the 
median count shows that sophisticate 
subjects counted more stuttering than 
did naive subjects (see Table 2). The 
obtained chi-square value (10.30 with 
1 df) indicates a significant lack of in- 
dependence. 

A comparison between these two 
groups with reference to their responses 
to the attitude scale shows that sophis- 
ticate subjects manifested a better, that 
is, more tolerant, attitude toward stut- 
tering than did the naive subjects (see 
Table 3). The obtained chi-square 


value (27.28 with 2 df) indicates a sig- 
nificant lack of independence. 
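A chi-square test of independence on the Table 3 frequencies can be sketched as follows. Expected frequencies are computed here from the exact marginal totals, whereas the printed table shows them rounded to whole numbers, so the result need not reproduce the reported figure exactly. The function is a generic illustration, not the author's worked computation.

```python
# Obtained frequencies from Table 3: rows are attitude ranges,
# columns are (naive group, sophisticate group).
TABLE_3 = [
    [20, 30],  # 1.00-1.49
    [47, 8],   # 1.50-1.99
    [34, 9],   # 2.00-2.49
]

def chi_square_independence(table):
    """Pearson chi square and degrees of freedom for an r x c contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi_square = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi_square += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi_square, df

chi_square, df = chi_square_independence(TABLE_3)
print(f"chi square (df = {df}) = {chi_square:.2f}")
```

With exact expected frequencies the statistic comes out near 28.2 rather than the reported 27.28; the gap may reflect rounding in the original computation. Either value is far beyond the 5.99 required at the 5% level.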

Since this study represents an initial 
exploration of the relationships between 
countings of stutterings and attitude 
toward stuttering, the possible effects 
of certain independent variables upon 
the dependent variables of measures of 
attitude and countings of stutterings 
were evaluated. Statistical analyses provided
no evidence that measures of attitude or countings
of stutterings depend upon the subjects’ marital
status, parental status (‘did have children’ and
‘did not have children’), age of subjects’ children,
grade level taught, or career plans.


Discussion 


The results of this study indicate a 
relationship between countings of stut- 
terings and attitude toward stuttering. 
In other words, a person’s attitude to- 
ward stutterings tends to influence his 
counting of stutterings. That is, sub- 
jects manifesting ‘good’ (tolerant) at- 
titudes toward stuttering tended to 
count more ‘moments of stuttering’ in 
a given sample of speech than did sub- 
jects manifesting ‘poor’ (intolerant) at- 
titudes toward stuttering. 

Although the relationship between 
countings of stutterings and attitude to- 
ward stuttering is not necessarily a 
causal one, at least three interpretations 
of the significant results are possible. 
(a) The relationship between identification
of stutterings and ‘good’ (tolerant)
attitude toward stuttering may
indicate that training in speech pathol- 
ogy acquaints a listener with the ‘right’ 
answers on the attitude scale. Training 
in speech pathology may also tend to 







make a listener more likely to classify 
nonfluencies as stuttering. (b) The re- 
lationship may indicate that acquaint- 
ance with stuttering tends to increase a 
listener’s acceptance of stuttering. As 
demonstrated in this study, subjects 
who had taken formal college course 
work in speech correction tended to 
have more tolerant attitudes toward 
stuttering than did subjects who had 
not taken such formal college course 
work. (c) The relationship may indi- 
cate that training in speech pathology 
tends not only to make a person more 
tolerant of stuttering, but it may tend 
also to make him less tolerant of speech 
nonfluencies. The result in this latter 
instance may be a greater inclination 
to classify nonfluencies as abnormal. 


Summary 


The purpose of this study was to 
test the relationship between attitude 
toward stuttering and countings of 
stutterings. Subjects, 148 elementary 
school teachers, counted stutterings on 
a tape-recorded passage of speech and 
reacted to a test of attitude toward 
stuttering. On the basis of the findings 
of this study, within the limitations of 
the experimental conditions and recog- 
nizing the questionable validity and 
reliability of the attitude scale, the fol- 
lowing conclusions are drawn: (a) 
More tolerant attitudes toward stutter- 
ing are accompanied by higher count- 
ings of stutterings. (b) Training in 
speech correction has an ameliorative 
effect upon attitude toward stuttering. 
Such training tends also to sensitize a 
listener to moments of stuttering. (c) 
Attitude toward stuttering and count- 
ings of moments of stuttering are not 
dependent in any important way upon 


marital status, parental status, age of 
the subjects’ children, grade level 
taught, and career plans. 


References 


1. Ammons, R., and Johnson, W., Studies in the psychology of
   stuttering: XVIII. The construction and application of a test of
   attitude toward stuttering. J. Speech Dis., 9, 1944, 39-49.

2. Bloodstein, O., A rating scale study of conditions under which
   stuttering is reduced. J. Speech Hearing Dis., 15, 1950, 29-36.

3. Boehmler, R., A quantitative study of the extensional definition of
   stuttering with special reference to the audible designata. Ph.D.
   dissertation, State Univ. Iowa, 1953.

4. Cameron, N., The Psychology of Behavior Disorders, a Biosocial
   Interpretation. Boston: Houghton Mifflin, 1947.

5. Darley, F., The relationship of parental attitudes and adjustments to
   the development of stuttering. In W. Johnson (Ed.), Stuttering in
   Children and Adults. Minneapolis: Univ. Minnesota Press, 1955.

6. Glaudin, V., The speed of visual recognition of primitive, social,
   and neutral words by criminal psychopathic, criminal normal, and
   hospitalized psychoneurotic subjects. Ph.D. dissertation, Univ.
   Illinois, 1954.

7. Johnson, W., Perceptual and evaluative factors in stuttering. In L.
   Travis (Ed.), Handbook of Speech Pathology. New York:
   Appleton-Century-Crofts, 1957.

8. Knudson, Thelma, A study of the oral recitation problems of
   stutterers. J. Speech Dis., 4, 1939, 235-239.

9. Lloyd, Gretchen W., and Ainsworth, S., The classroom teacher’s
   activities and attitudes relating to speech correction. J. Speech
   Hearing Dis., 19, 1954, 244-249.

10. McGinnies, E., Emotionality and perceptual defense. Psychol. Rev.,
    56, 1949, 244-251.

11. Remmers, H. H., An experiment on the retention of attitudes as
    changed by instructional materials, Further Studies in Attitudes,
    Series III, in Studies in Higher Education XXXIV. Lafayette, Ind.:
    Purdue Univ. Press, 1938.

12. Rose, A. M., Studies in the Reduction of Propaganda. Chicago: Am.
    Coun. Race Rel., 1947.

13. Rosenberg, S., and Curtiss, J., The effect of stuttering on the
    behavior of the listener. J. abnorm. soc. Psychol., 49, 1954,
    355-361.

14. Siegel, S., Nonparametric Statistics for the Behavioral Sciences.
    New York: McGraw-Hill, 1956.

15. Smith, F. T., An Experiment in Modifying Attitudes Toward the Negro.
    New York: Teach. Coll., Columbia Univ., 1943.

16. Smith, M., A study of change of attitude toward the Negro. J. Negro
    Educ., 8, 1939, 64-70.

17. Tuthill, C. E., A quantitative study of extensional meaning with
    special reference to stuttering. Speech Monogr., 13, 1946, 81-98.

18. Webb, C., Selected variables in nasality judgment. Ph.D.
    dissertation, Pennsylvania State Univ., 1957.





Infant Speech: Effect of Systematic Reading of Stories 


ORVIS C. IRWIN 


In a previous study (4) it was found 
that differences in the speech sound 
status of two groups of infants exist 
when they are categorized according 
to the occupational level of their fath- 
ers. The fathers of one group of babies 
were in business and the professions, the 
fathers of the other group were skilled, 
semiskilled, or unskilled workers. The 
influence of occupational level was 
found to be negligible during the first 
year and a half, but during the period 
from about 18 months to 30 months 
the speech sound superiority of the 
former group of infants over the latter 
was statistically significant. 

The present study was designed to 
test the hypothesis that in the homes 
of working families systematic read- 
ing of stories to infants during the 
year-and-a-half period between the 
ages of 13 and 30 months will increase 
the amount of their phonetic produc- 
tion. 


Subjects 


Two groups of infants were selected 
from families whose fathers were en- 
gaged in occupations which fall into 
the following categories: day laborer, 
truck driver, fireman, policeman, me- 





Orvis C. Irwin (Ph.D., Ohio State Univer- 
sity, 1929) is Research Professor of Logo- 
pedics, Institute of Logopedics, Wichita. This 
study was made while he was Professor of 
ou. te Iowa Child Welfare Research 
Station, University of Iowa. 




chanic, delivery man, electrician, print- 
er, ambulance driver, nurseryman, 
tavern keeper, carpenter, barber, tent- 
maker, and butcher. The experimental 
group included 24 infants, the control 
group 10. All of the children were con- 
sidered physically normal; all were 
from Iowa City homes which, with 
only a few exceptions, were monolin- 


gual. 
Method 


Mothers of the 24 infants in the experimental group
were instructed to spend 15 or 20 minutes each day
reading stories to their children from illustrated
children’s story books,* pointing out the pictures,
talking about them, making up original, simple tales
about them, and in general furnishing materials
supplemental to the text so that the speech sound
environment impinging upon the children would be
enriched. In order to assure that the regimen was
carried out by the parents, frequent consultations were
held with them. Two or three books were brought into
each of the homes during each two-month period
beginning when the child was 13 months of age and
continuing until he was 30 months old.





*The Little Golden Books, published by 
Simon and Schuster, New York City, were 
used in this study. 












TABLE 1. Comparison of mean phoneme frequency scores of two groups of
infants at two-month intervals during study of effect of systematic
reading of stories on their speech sound production. The experimental
group (E) of 24 children was under a regimen of enriched reading; the
control group (C) of 10 children was not. [The Group, Mean, and SD
columns are illegible in the source and are omitted here.]

Age Level     Diff.      t*       Sign.
(months)                          Level

13-14         -2.5
15-16         -1.7
17-18          6.2      2.21      .02
19-20          9.7      3.62      .01
21-22         12.0      1.77      .10
23-24         16.4      2.13      .05
25-26         25.4      2.73      .01
27-28         36.7      4.03      .005
29-30         31.1      4.26      .005

* One-tailed test. Beginning with the 21-22-month level a modified
t test which allows for unequal variances was used.


Books were not furnished to the par- 
ents of the 10 children in the control 
group and no reading regimen was 
prescribed for the group. This of 
course does not mean that the control 
group children did not receive the 
customary stimulation characteristic of 
these homes. 


The children of both groups were
regularly paid an afternoon visit during
each two-month period and their
spontaneous speech was recorded by 
paper and pencil in the international 
phonetic alphabet rather than by a tape 


recorder. As a rule one parent was 
present. No effort was made to stimu- 
late the child’s vocalization. It has been 
demonstrated (3, 6) that infants’ artic- 
ulation may be measured in terms of 
phoneme type or frequency. A pho- 
neme type is one of the individual 
sounds listed in the international pho- 
netic alphabet. Phoneme frequency is 
defined as the total number of times 
a particular type occurs in a given 
speech sample. An infant who vocalizes 
a single phoneme would receive a type 
count of one and a frequency count 





Irwin: Reading Regimen and Infant Speech 






[Figure 1 plot: FREQUENCY (vertical axis) against age in MONTHS,
13-14 through 29-30 (horizontal axis); curve labeled EXPERIMENTAL.]

Figure 1. Graphic presentation of mean phoneme frequency scores of
two groups of young children. Children in experimental group were
under a regimen of enriched reading. Children in control group were
not.


equal to the number of productions of 
that type. The present report is con- 
cerned with the amount of vocaliza- 
tion as measured by the total phoneme 
frequency of all types. It has been 
found (5) that a satisfactory unit for 
observing phoneme frequency is one 
breath. The reliability of the observer 
who recorded the sounds live by paper 
and pencil in the international phonetic 
alphabet has been demonstrated and re- 
ported elsewhere (2, 4). The vocaliza- 
tions of sounds on 30 breaths con- 
stituted the sample taken at each visit. 
The phoneme frequency score for each 
child at each age was his total score at 
that age. 
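The distinction between phoneme type and phoneme frequency can be illustrated with a small tally. The transcription below is invented for the example (three breaths instead of the 30 taken at each visit), and the function name is hypothetical; only the two definitions come from the text.

```python
from collections import Counter

def score_sample(breaths):
    """Return (type count, total frequency, per-phoneme counts) for one sample.

    `breaths` is a list of breaths, each a list of transcribed phoneme symbols.
    """
    counts = Counter(phoneme for breath in breaths for phoneme in breath)
    type_count = len(counts)                # number of distinct phonemes produced
    total_frequency = sum(counts.values())  # total productions of all types
    return type_count, total_frequency, counts

# Hypothetical transcription, for illustration only.
sample = [["m", "a", "m", "a"], ["b", "a"], ["a"]]
types, frequency, counts = score_sample(sample)
print(f"types = {types}, total frequency = {frequency}, 'a' occurred {counts['a']} times")
```

In this invented sample the child produces three types with a total frequency of seven. The total frequency over all types is the score analyzed in the present report.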


Results 


Data were grouped into two-month 
age levels for analysis. Mean phoneme 
frequency scores (Table 1 and Figure 
1) show that from the thirteenth until 
about the seventeenth month there is 
little difference between the experi- 
mental and control groups. Soon after 
the seventeenth month the curves for 




the two groups separate and thereafter 
the means of the experimental group 
consistently exceed those of the control 
group until the age of two and a half 
years, the age at which the experiment 
was terminated. 


Table 1 gives the phoneme frequency scores for the two
groups and the significance of the differences between
them. Except for the first two age levels all
differences are in favor of the experimental group.
(The differences at the first two age levels were not
tested.) A one-tailed t test was applied. It will be
noted that for the experimental group not only the
means but also the variances increase with increase in
age, while the variances for the control group remain
about the same. The effect is to render the variances
for the two groups unequal beginning with and after the
21-22-month level. For evaluation of differences
beginning with the 21-22-month level it was thus
necessary to use a modified t test (1, p. 454) which
can be applied regardless of the sizes of the
variances. All differences are significant at or beyond
the 5% level except that of the 21-22-month age. The
differences between the experimental and control groups
increase markedly after this period.
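One common form of a t test that does not assume equal variances, often called the Cochran-Cox approximation, can be sketched as below. Whether this is exactly the procedure used in the study is an assumption. The means and standard deviations are hypothetical, since the group statistics are not legible in the source; only the group sizes (24 and 10) come from the article. The critical values 1.714 and 1.833 are the standard one-tailed 5% points of t for 23 and 9 degrees of freedom.

```python
import math

def cochran_cox(mean1, sd1, n1, mean2, sd2, n2, t_crit1, t_crit2):
    """Return (t', weighted critical value) for two groups with unequal variances.

    t_crit1 and t_crit2 are tabled critical t values for n1 - 1 and n2 - 1 df.
    """
    w1 = sd1 ** 2 / n1  # squared standard error of the first mean
    w2 = sd2 ** 2 / n2  # squared standard error of the second mean
    t_prime = (mean1 - mean2) / math.sqrt(w1 + w2)
    # Critical value: average of the two tabled values, weighted by w1 and w2.
    t_critical = (w1 * t_crit1 + w2 * t_crit2) / (w1 + w2)
    return t_prime, t_critical

# Hypothetical group statistics, for illustration only.
t_prime, t_critical = cochran_cox(
    mean1=120.0, sd1=30.0, n1=24,   # experimental group (values assumed)
    mean2=95.0, sd2=12.0, n2=10,    # control group (values assumed)
    t_crit1=1.714, t_crit2=1.833,   # one-tailed 5% points for 23 and 9 df
)
print(f"t' = {t_prime:.2f}, critical value = {t_critical:.3f}")
```

The weighted critical value always falls between the two tabled values, so the test stays usable when one group is both larger and more variable than the other, which is the situation described for the later age levels.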


The results of this study suggest that 
systematically increasing the speech 
sound stimulation of infants under two 
and a half years of age in homes of 
lower occupational status by reading 
and by telling stories about pictures 
will lead to an increase in the phonetic 
production of these infants over what 
might be expected without reading en- 
richment. 







Summary 


This study was designed to test the 
effect which systematic reading of 
stories would have on phonetic pro- 
duction of very young children. Sub- 
jects were 34 children; the experiment 
began in their thirteenth month and 
ended in their thirtieth month. During 
this period books were furnished 
weekly and a regimen of reading was 
prescribed for the children in the ex- 
perimental group (N = 24), but not 
for the children in the control group 
(N = 10). Spontaneous vocalization of 
each of the 34 children was recorded 
by paper and pencil in the international 
phonetic alphabet in home visits dur- 
ing each two-month period throughout 
the experiment. 


Little difference was found between
the groups in the mean scores for
phoneme frequency until about the
seventeenth month; from then on the
difference increased consistently with
the experimental group having higher
scores than the control group.


References 


1. Cochran, W. G., and Cox, Gertrude M.,
Experimental Designs. New York: Wiley,
1950.

2. Irwin, O. C., Correct status of a third
set of consonants in the speech of cerebral
palsy children. Cerebr. Pal. Rev.,
18, 1957, 3, 17-20.

3. Irwin, O. C., Development of speech
during infancy: curve of phonemic frequencies.
J. exp. Psychol., 37, 1947, 187-193.

4. Irwin, O. C., Infant speech: the effect of
family occupational status and of age on
sound frequency. J. Speech Hearing Dis.,
13, 1948, 320-323.

5. Irwin, O. C., Reliability of infant speech
sound data. J. Speech Hearing Dis., 10,
1945, 227-235.

6. Irwin, O. C., and Chen, H. P., Development
of speech during infancy: curve
of phonemic types. J. exp. Psychol., 36,
1946, 431-436.

7. Irwin, O. C., and Curry, J., Vowel elements
in the crying vocalization of infants
under ten days of age. Child Develpm.,
12, 1941, 99-109.







Several Procedures for Scaling Articulation 


DOROTHY SHERMAN 


WALTER L. CULLINAN 


The present study is concerned with 
the psychological scaling of defective 
articulation. The scaling method under 
consideration is that of equal-appearing 
intervals. Reliable scale values of de- 
fectiveness of articulation for short seg- 
ments of speech, five seconds or 10 
seconds long, have been obtained by this 
method (2, 4). 


Reliable scale values have been 
obtained also for one-minute tape-re- 
corded speech samples (5). These 
samples were divided into short seg- 
ments, and the segments were then 
randomized so that no two segments 
from any one speech sample were 
placed immediately adjacent to each 
other. Mean scale values were calcu- 
lated for each speech sample from re- 
sponses of each of a number of single 
observers, employing a nine-point
equal-appearing-intervals scale. The ob- 
servers had been trained with previous- 
ly constructed tape-recorded severity 
scales (2). Satisfactory reliability of 
mean scale values based upon single 
observer ratings was established. 





Dorothy Sherman (Ph.D., University of 
Iowa, 1951) is Associate Professor of Speech
Pathology and Audiology, University of 
Iowa. Walter L. Cullinan (M.A., University 
of Iowa, 1960) is Audiology Fellow, Univer- 
sity of Iowa. 




The question arises as to whether 
reliable mean scale values of defective- 
ness of articulation can be obtained 
from consecutive responses of a single 
observer at short intervals throughout 
each speech sample, that is, without 
employing the randomizing procedure 
described above. Sherman (3) found 
the consecutive-interval method useful 
in scaling the severity of stuttering. 
Sherman and Morrison (5), however, 
question the independence of judgments 
of articulation defectiveness at consecu- 
tive intervals during one speech sample 
because the severity of articulation de- 
fectiveness is probably less variable 
from one consecutive short speech seg- 
ment to another than is severity of 
stuttering. Further questions arise as to 
(a) whether single ratings of severity 
of defectiveness of articulation for one- 
minute speech samples as a whole will 
give reliable scale values and (b) 
whether scale values so obtained will 
agree with mean values obtained on 
the basis of the consecutive 10-second 
ratings. 

Another question arises as to whether 
any part of a one-minute speech sample 
has more influence than any other part 
on the over-all rating of severity of ar- 
ticulation defectiveness. The question 
might be asked, for example, as to
whether the rater is influenced more
by the first 10 seconds or by the last 
10 seconds of the sample. 


The main purposes of this study, 
then, are (a) to evaluate the reliability 
of mean scale values of articulation de- 
fectiveness based upon single observer 
consecutive ratings at 10-second inter- 
vals during one-minute speech samples, 
(b) to evaluate reliability of individual 
observer ratings of articulation defec- 
tiveness based upon single ratings of 
one-minute speech samples, and (c) to 
evaluate how well the scale values ob- 
tained by the above two methods agree 
with each other and with the scale 
values obtained by the randomization- 
of-segments method (5). A minor pur- 
pose is the evaluation of the relation- 
ships between single ratings of one- 
minute samples and any one of the six 
10-second segments, most particularly 
the beginning and ending segments. 


Procedure 


Stimulus Material. The stimulus material
consisted of 50 one-minute tape
recordings of continuous speech of chil- 
dren between the ages of five and 10 
years. These 50 recordings represent 
a range of articulation ability from 
normal to severely defective. They 
were tape-recorded speech samples 
which had previously been scaled for 
defectiveness of articulation as reported 
in a study by Sherman and Morrison 
(5). 

The 50 recorded samples were copied 
twice with about five seconds between 
samples for a recorded announcement 
of the number of each sample. On one 
of the copies, the word "rate" was recorded
at 10-second intervals during
each speech sample to provide the signals
for listeners to record their ratings,
six for each one-minute sample.

The original recordings were made 
by Morrison on a Magnecord tape re- 
corder, Model PT6-V, with an Altec, 
Model 21C, condenser microphone, and 
with a tape speed of 15 inches per sec- 
ond. The copies for the present study 
were re-recorded from the original tape 
on a Presto tape recorder, Model RC 
10/24, at a tape speed of 15 inches per 
second. 


Training and Practice Material. 
Tape-recorded severity scales of artic- 
ulation defectiveness were available 
from the Sherman and Morrison (5) 
study. The scales consisted of four sets 
of nine 10-second segments each, with 
each set or scale having one segment 
at each of nine levels of severity rang- 
ing from one for least defective articu- 
lation to nine for most defective 
articulation. Another tape consisted of 
the 36 segments of the four severity 
scales arranged in random order with 
respect to severity of articulation de- 
fectiveness. This tape was used for 
practice rating during the training ses- 
sion. 


Scaling Method. The scaling method 
of equal-appearing intervals was se- 
lected on the basis of the results of 
Morrison (2) and Sherman and Moodie 
(4). Morrison obtained reliable scale 
values of severity of articulation de- 
fectiveness for segments five seconds 
long and for segments 10 seconds long 
by the use of this method. Sherman and 
Moodie found this method more useful 
for scaling defectiveness of articulation 
of five-second speech segments than 
the methods of successive intervals, pair 
comparisons, or constant sums. The 
choice of 10 seconds rather than five
seconds as the length of the segments
to be employed in the present study 
was an arbitrary choice except for the 
purpose of reducing the tasks of com- 
putation and of tape preparation. 


Listening Conditions. The listening 
sessions were held in a sound-treated 
room which contained the playback 
equipment. The equipment consisted 
of an Ampex, Model 350, tape play- 
back; a McIntosh amplifier, Model 
20W2; and a Jensen Duax loudspeaker. 


Observers. The observers who rated 
the samples at consecutive 10-second 
intervals were 14 graduate students in 
speech pathology, all of whom were 
trained and experienced in clinical 
evaluation of articulation defects. The 
observers who rated each sample as a 
whole were 15 students, mainly under- 
graduates, enrolled in a voice and ar- 
ticulation disorders course, all of whom 
had had some training and clinical ex- 
perience in diagnosing and appraising 
articulation defects. 

Rating at Consecutive Intervals. A 
training session preceded the experi- 
mental session for rating the samples at 
consecutive intervals. The four previ- 
ously constructed severity scales were 
played for the observers, each scale 
consisting of nine 10-second segments, 
with each set having one segment at 
each of nine levels of severity ranging 
from one for least defective articulation
to nine for most defective articulation.
The instructions for rating the
speech segments were then given. The 
observers were told to rate each seg- 
ment as a whole and to avoid, insofar 
as possible, being influenced by any 
factor other than articulation. The 36 
segments of the four severity scales 
were then presented in random order
and rated by the observers on a nine-
point scale by the method of equal- 
appearing intervals. These segments 
were then played back; the previously 
established level of severity was an- 
nounced by the experimenter so that 
the observers could compare their rat- 
ings with the established levels. The 
entire procedure was then repeated. 

Following the training session, the 
instructions for the experimental judg- 
ing were read to the observers. The 
first five samples were used for prac- 
tice and were repeated at the end of 
the judging for the purpose of complet- 
ing the experimental task. The training 
and the experimental judging session 
each required approximately one hour 
and 15 minutes, or a total of about two 
hours and 30 minutes. There were short 
rest periods between the training ses- 
sion and the experimental session, after 
25 of the experimental samples had been 
presented, and during changing of 
tapes. 

Ratings of Each Sample as a Whole. 
Preceding the experimental judging ses- 
sion for rating each sample as a whole, 
the two least severe and the two most 
severe one-minute speech samples, as 
determined by results of the Sherman 
and Morrison (5) study, were played 
for the observers as representing the 
extremes of the scale. The instructions 
for rating the speech samples were then 
given. The observers were told to rate 
each entire one-minute sample as a 
whole and to avoid, insofar as possible, 
being influenced by any factor other 
than articulation. The first five speech 
samples were then rated for practice. 
These ratings were not included in the 
experimental data. The same five one- 
minute samples were repeated for ratings
at the end of the experimental
session. The entire session required approximately
one hour and 15 minutes.
A short rest period was given after 25
samples had been presented.


Table 1. Summaries of analyses of variance testing differences among observers with respect to general level of rating 50 one-minute speech samples.

Source               df       ms       F       F.05
Consecutive Intervals
  Observers (O)      13      5.33     8.20*    [illegible]
  Samples (S)        49     77.60
  O × S             637      0.65
  Total             699
Sample as a Whole
  Observers (O)      14      4.71     6.45*    1.75
  Samples (S)        49     94.09
  O × S             686      0.73
  Total             749

*$F = ms_O / ms_{OS}$.


Results and Discussion 


Scale Values. The 50 one-minute 
speech samples were scaled by 14 ob- 
servers at six consecutive 10-second 
intervals for each sample on a nine- 
point equal-appearing-intervals scale, 
with one representing least articulation 
defectiveness and nine representing most
articulation defectiveness. Mean scale 
values of the degree of articulation de- 
fectiveness were obtained by averaging 
for each of the 50 one-minute samples 
the six responses of each of the 14 ob- 
servers. There were thus 14 mean scale 
values for each of the 50 samples or 700 
scale values in all. 
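
The averaging step just described can be sketched as follows. The nested-list layout and the function name are hypothetical conveniences, and the ratings are invented for illustration; only the procedure (one mean of six consecutive ratings per observer per sample) comes from the text.

```python
from statistics import mean

def mean_scale_values(ratings):
    """ratings[observer][sample] is a list of six consecutive ratings (1 to 9)."""
    return [[mean(six) for six in per_sample] for per_sample in ratings]

# Hypothetical ratings: 2 observers x 3 samples x 6 intervals.
ratings = [
    [[2, 2, 3, 2, 3, 3], [7, 8, 7, 7, 8, 8], [5, 5, 4, 5, 5, 6]],
    [[1, 2, 2, 2, 2, 3], [8, 8, 9, 8, 8, 7], [4, 5, 5, 5, 4, 4]],
]
values = mean_scale_values(ratings)
print(values[0][0])  # mean of the first observer's six ratings of sample 1
print(14 * 50)       # 14 observers x 50 samples = 700 mean scale values
```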

The same 50 one-minute speech 
samples were scaled also on a nine-point 
equal-appearing-intervals scale by 15 
additional observers who each gave 
one rating to each sample as a whole
rather than consecutive ratings at specified
intervals throughout the sample.
The 15 single ratings of each of the 50 
samples resulted in a total of 750 rat- 
ings. 


Reliability of Individual Observers. 
The intraclass correlation technique for 
evaluating the reliability of individual 
ratings as described by Ebel (1) was 
applied to each of the two sets of scale 
values. Summaries of analyses of vari- 
ance for each set, reported in Table 1, 
indicate that the observers within each 
of the two groups differ significantly
from each other in general level of 
rating. For this reason the formula 
which removes the between-observers 
variance in estimating the reliability 
of ratings was employed. The intra- 
class correlation coefficient for the 
group who rated at consecutive intervals
is .89 (95% confidence interval:
.84 < r < .93) and the coefficient for the
group who rated each sample as a whole
is also .89 (95% confidence interval:
.84 < r < .98).¹ The scale values derived
from the ratings of individual observers
thus are apparently of about the same 
reliability for both methods. The high 
coefficients are good evidence that scale 
values of articulation defectiveness for 
one-minute speech samples obtained by 
either of the above methods are quite 
reliable. The usefulness of scale values 
based upon the responses of any one 
individual would, of course, depend 
upon establishing the reliability of scale 
values based upon that individual’s re- 
sponses. 
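
As a rough check on the coefficients reported above, the sketch below applies a commonly cited form of Ebel's expression for the reliability of individual ratings with the between-observers variance removed, r = (ms_S - ms_OS) / (ms_S + (k - 1) ms_OS), to the mean squares of Table 1. That this is the exact variant used by the authors is an assumption, and the function name is hypothetical. Because the tabled mean squares are rounded, the results can be expected to agree with the reported .89 only approximately.

```python
def ebel_reliability(ms_samples, ms_error, k):
    """Reliability of a single observer's ratings (between-observers variance removed)."""
    return (ms_samples - ms_error) / (ms_samples + (k - 1) * ms_error)

# Mean squares taken from Table 1: ms for samples, ms for O x S, number of observers.
consecutive = ebel_reliability(77.60, 0.65, 14)
whole = ebel_reliability(94.09, 0.73, 15)
print(f"consecutive intervals: {consecutive:.3f}")
print(f"sample as a whole:     {whole:.3f}")

# F ratios for observers, ms_O / ms_OS, as in the table footnote.
print(f"F = {5.33 / 0.65:.2f} and {4.71 / 0.73:.2f}")
```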

¹The intraclass correlation coefficients and
the 95% confidence intervals were obtained
by the formulae given by Ebel (1).


Reliability was evaluated also for
each of the observers separately. Each
set of 50 mean scale values derived from
the responses of a single observer was 
correlated with a set of mean scale 
values based upon the corresponding 
sets of values for the other observers in 
his group. The range of the obtained 
Pearson rs was from .91 to .96 (obtained 
rs: .91, .92, .93, .93, .94, .94, .94, .94, .95, 
.95, .96, .96, .96, .96) for the group
which rated at consecutive intervals and
from .91 to .97 (obtained rs: .91, .92,
.92, .93, .94, .94, .94, .95, .95, .95, .96,
.96, .96, .97, .97) for the group which 
rated each sample as a whole. The 
placement of the means to agree with 
group means in relative positions along 
the severity dimension was thus evi- 
dently quite precise for each individual 
observer. Because of the high Pearson 
rs obtained by all of the observers and 
because of the high intraclass correla- 
tion with a fairly high limit of the 95% 
confidence interval it may be expected, 
in general, that satisfactorily reliable 
scale values of severity of articulation 
defectiveness can be obtained from rat- 
ings made by observers when their 
training, the samples, and rating condi- 
tions are similar to those of the present 
experiment. 
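
The per-observer check described above, in which each observer's set of values is correlated with the means of the remaining observers, can be sketched as below. The data are hypothetical (three observers, five samples) and the function names are illustrative only.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Product-moment correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def observer_vs_rest(values, i):
    """values[observer][sample]; correlate observer i with the mean of the others."""
    others = [row for j, row in enumerate(values) if j != i]
    rest_means = [mean(col) for col in zip(*others)]
    return pearson_r(values[i], rest_means)

# Hypothetical mean scale values, NOT data from the study.
values = [
    [1.5, 3.0, 4.5, 6.0, 8.0],
    [2.0, 3.5, 4.0, 6.5, 8.5],
    [1.0, 2.5, 5.0, 5.5, 7.5],
]
for i in range(len(values)):
    print(i, round(observer_vs_rest(values, i), 3))
```

Excluding the observer's own values from the comparison set keeps the correlation from being inflated by the observer's contribution to the group mean.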


As noted above, observers differ sig- 
nificantly in general level of rating: 
that is, in general, observers are not 
consistent as to which portions of the 
scale they tend to use. Many of the dif- 
ferences among means for individuals 
are significant. The critical differences²
required for significance at the 5% 
level are small: .32 for the group which 
judged at consecutive intervals and .34 
for the group which judged each 
sample as a whole. For the group which 
judged at consecutive intervals, the
range of the obtained 91 differences is
from .01 to 1.02 with a mean difference
of .39 and with 52 of the differences
significant, and for the group which
rated each sample as a whole the range
of the 105 differences is from .02 to
1.12 with a mean difference of .36 and
with 51 of the differences significant.

²Critical difference = $t_{.05}\,(2\,ms_{OS}/s)^{1/2}$.
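
The critical differences can be approximated from the error mean squares in Table 1, taking s as the number of samples (50). In the sketch below the value 1.96 for the tabled t is an assumption (a large-sample approximation), so the results may differ from the reported .32 and .34 in the last rounded digit. The counts of pairwise differences follow from the number of observers in each group.

```python
from math import comb, sqrt

def critical_difference(ms_os, n_samples, t_crit=1.96):
    """t * sqrt(2 * ms_OS / s); t_crit = 1.96 is an assumed large-sample value."""
    return t_crit * sqrt(2 * ms_os / n_samples)

cd_consecutive = critical_difference(0.65, 50)
cd_whole = critical_difference(0.73, 50)
print(f"{cd_consecutive:.3f} {cd_whole:.3f}")

# Number of pairwise differences among 14 and among 15 observers.
print(comb(14, 2), comb(15, 2))  # 91 105
```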


Comparison of the Three Methods. 
Pearson rs were employed for the pur- 
pose of making comparisons among the 
three methods of obtaining ratings of 
the speech samples: (a) judging seg- 
ments at consecutive intervals, (b) 
judging the one-minute sample as a 
whole, and (c) judging randomized 
segments. Available for this purpose 
were the mean scale values obtained by 
Sherman and Morrison (5) by the 
method of judging randomized seg- 
ments. The Pearson r for estimating 
the relationship between the 50 mean 
scale values derived from the judgments 
made at consecutive intervals and the 
50 mean scale values derived from the 
judgments of those who rated each 
sample as a whole is .99 with a 95% 
confidence interval of .98 to .99. The 
Pearson r for estimating the relation- 
ship between the 50 mean scale values 
derived from judgments of randomized 
segments and (a) the 50 mean scale 
values derived from the judgments made 
at consecutive intervals and also (b) 
the 50 mean scale values obtained from 
each sample as a whole was, in each 
case, .98 with a 95% confidence inter- 
val of .96 to .99. The high correlations 
and small 95% confidence intervals are 
evidence of a very strong relationship 
between any two sets of measures ob- 
tained by the three methods. It thus 
seems reasonable that, at least for many 
purposes, the time and effort involved
in cutting and splicing tapes in order
to randomize segments is unnecessary. 
Although the question, raised by Sherman
and Morrison (5), regarding the
independence of judgments of articulation
defectiveness at consecutive intervals
is not answered by the results
of this study, the dependence or independence
of the judgments would seem
to be of little importance because of
the very high correlation between the
mean scale values obtained by judging
the segments consecutively and the
mean scale values obtained by judging
randomized segments. The very high
correlations between the mean scale
values obtained by having each sample
rated as a whole and the mean scale
values obtained by the other two methods
are evidence that for many purposes
the one-minute speech samples would
not have to be broken down into segments
to obtain useful and reliable scale
values.


Figure 1. Scale values obtained by Sherman
and Morrison (5), using the randomized segments
procedure, plotted against the corresponding
scale values obtained in this study
using the consecutive intervals procedure.
[Scatter plot; vertical axis: scale values (randomized segments); horizontal axis: scale values (consecutive intervals); regression line Y = .87X + .24.]


Figure 2. Scale values obtained by the one-minute
single-rating procedure plotted against
the corresponding scale values obtained by
the consecutive intervals procedure.
[Scatter plot; vertical axis: scale values (one-minute); horizontal axis: scale values (consecutive intervals); regression line Y = 1.05X - .30.]


Figure 3. Scale values obtained by Sherman
and Morrison (5), using the randomized segments
procedure, plotted against the corresponding
scale values obtained in this study
using the one-minute single-rating procedure.
[Scatter plot; horizontal axis: scale values (one minute); regression line Y = .81X + .58.]
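
The source does not state how the 95% confidence intervals for the Pearson rs were obtained. One standard approach, assumed here purely for illustration, is the Fisher z transformation with n = 50 samples; the function name is hypothetical. Under that assumption the intervals agree with those reported for r = .99 and r = .98.

```python
from math import atanh, sqrt, tanh

def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% confidence interval for a correlation via Fisher's z."""
    z = atanh(r)
    half_width = z_crit / sqrt(n - 3)
    return tanh(z - half_width), tanh(z + half_width)

for r in (0.99, 0.98):
    lo, hi = fisher_ci(r, 50)
    print(r, round(lo, 2), round(hi, 2))
# prints: 0.99 0.98 0.99
#         0.98 0.96 0.99
```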


The mean scale values obtained by
each of the two procedures were
plotted against the mean scale values
obtained by Sherman and Morrison (5)
and also against each other. These plots,
along with the linear regression lines
and their corresponding equations, are
to be seen in Figures 1, 2, and 3, showing
mean scale values for procedures
of (a) randomized segments and con- 
secutive intervals, (b) one-minute 
single rating and consecutive intervals, 
and (c) randomized segments and one- 
minute single rating, respectively. The 
range of all deviations from the three 
regression lines is from .00 to 1.15, with 
only three deviations larger than 1.00 
and with mean deviations from the 
three lines of .33, .27, and .35, respec- 
tively; the corresponding standard er- 
rors of estimate are .41, .35, and .41. 
Although, as previously mentioned, in- 
dividuals varied significantly in general 
level of rating, the regression lines and 
their equations demonstrate that cor- 
responding mean scale values obtained 
by the three procedures are, in general, 
approximately equal and that approxi- 
mately the entire scale is employed in 
all three instances. The biggest differ- 
ence is between the sets of scale values 
for the procedures of randomized seg- 
ments and one-minute single rating. The 
range of scale values is comparatively 
restricted at both ends of the scale, but 
particularly at the upper end, for the 
randomized segments procedure. Even 
here, however, the difference is not 
large. 
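
The restriction of range noted above can be seen by evaluating the three regression equations, as read from Figures 1, 2, and 3, at the ends of the nine-point scale. The dictionary keys and the function name in this sketch are hypothetical labels; the coefficients are those printed with the figures.

```python
# Regression equations as read from Figures 1-3 (Y on X): (slope, intercept).
equations = {
    "randomized on consecutive": (0.87, 0.24),
    "one-minute on consecutive": (1.05, -0.30),
    "randomized on one-minute": (0.81, 0.58),
}

def predict(slope, intercept, x):
    """Predicted Y for a given X on the regression line."""
    return slope * x + intercept

for name, (slope, intercept) in equations.items():
    low, high = predict(slope, intercept, 1), predict(slope, intercept, 9)
    print(f"{name}: X=1 -> {low:.2f}, X=9 -> {high:.2f}")
```

For the randomized-segments values regressed on the one-minute ratings, the predicted values at X = 1 and X = 9 are about 1.39 and 7.87, which is the narrowing at both ends of the scale that the text describes.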


Segment Influence. The median scale 
value for each of the six 10-second 
segments for each of the 50 one-minute 
speech samples was obtained from the 
responses of the 14 observers by the 
method described by Thurstone and 
Chave (6). Three sets of 50 mean scale 
values were obtained by taking the 
means of the medians of the first and 
second segments, of the third and fourth 
segments, and of the fifth and sixth 
segments for each of the 50 speech 
samples. The Pearson rs resulting from
correlating each of these three sets of
scale values with mean scale values de- 
rived from the 15 single ratings of each 
sample as a whole were .97, .98, and .98, 
respectively. The set of 50 median scale 
values for each of the six 10-second 
segments was then correlated with the 
mean scale values derived from the 15 
single ratings of each sample as a whole. 
The obtained Pearson rs are .96, .96, 
.97, .97, .98, and .98 for the first through 
the sixth segments, respectively. The 
six segments thus apparently have about 
equal influence on the over-all rating 
given to a one-minute speech sample. 
Fairly reliable estimates of articulation 
defectiveness of one-minute speech 
samples can thus be obtained by psy- 
chological scaling of a 10-second seg- 
ment from each sample. For many 
purposes, however, samples longer than 
10 seconds would of course be neces- 
sary: for example, 10 seconds of speech 
is unlikely to be sufficient for a com- 
plete analysis of articulation errors. 
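
A median scale value for one segment can be computed from the observers' category ratings by interpolating within the median category. The sketch below treats a rating of k as the interval from k - 0.5 to k + 0.5 and interpolates linearly; this is an assumption offered as one common way of carrying out a median of this kind, not a statement of the exact procedure followed. The ratings and the function name are hypothetical.

```python
from collections import Counter

def interpolated_median(ratings):
    """Median of integer category ratings with linear interpolation in the median category."""
    counts = Counter(ratings)
    half = len(ratings) / 2
    below = 0  # number of ratings in categories below the current one
    for k in sorted(counts):
        if below + counts[k] >= half:
            return (k - 0.5) + (half - below) / counts[k]
        below += counts[k]

# Hypothetical ratings of one 10-second segment by 14 observers.
segment_ratings = [3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5, 6, 6, 7]
median_value = interpolated_median(segment_ratings)
print(round(median_value, 2))
```

Here six ratings fall below category 5 and five fall in it, so the median lies one fifth of the way through the interval 4.5 to 5.5.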


Summary 


Measures of articulation defective- 
ness were obtained for 50 one-minute 
tape-recorded samples of children’s 
speech. Using a nine-point equal-ap- 
pearing-intervals scale, 14 observers 
rated consecutive 10-second segments 
of each sample; a mean scale value was 
computed for each sample for each 
observer. An additional 15 observers 
rated each sample once as a whole. 
For both procedures measures based 
upon individual observer responses were 
satisfactorily reliable. 

As measured by correlation between 
group means, results of the two pro- 
cedures are similar. Results also agree 
closely with those of a previous experiment
by which scale values for the
same 50 samples were obtained from 
observer responses to randomized 10- 
second segments with no two segments 
from the same sample presented con- 
secutively. 

Each of six sets of median scale 
values for 10-second segments (14 ob- 
servers) is highly correlated with the 
set of means of responses to each sam- 
ple as a whole (15 observers). 


References 


1. Ebel, R. L., Estimation of the reliability
of ratings. Psychometrika, 16, 1951, 407-424.

2. Morrison, Sheila, Measuring the severity
of articulation defectiveness. J. Speech
Hearing Dis., 20, 1955, 347-351.

3. Sherman, Dorothy, Reliability and utility
of individual ratings of severity of audible
characteristics of stuttering. J.
Speech Hearing Dis., 20, 1955, 11-16.

4. Sherman, Dorothy, and Moodie, Catherine E.,
Four psychological scaling methods
applied to articulation defectiveness.
J. Speech Hearing Dis., 22, 1957, 698-706.

5. Sherman, Dorothy, and Morrison, Sheila,
Reliability of individual ratings of severity
of defective articulation. J. Speech
Hearing Dis., 20, 1955, 352-358.

6. Thurstone, L. L., and Chave, E. J., The
Measurement of Attitude. Chicago: Univ.
Chicago Press, 1929.





